This is an archive of the discontinued LLVM Phabricator instance.

Differential D28797

[LangRef] Make @llvm.sqrt(x) return undef, rather than have UB, for negative x.
ClosedPublic

Authored by jlebar on Jan 17 2017, 12:15 AM.

Details

Reviewers

sanjoy
mehdi_amini
arsenm
hfinkel
efriedma

Commits

rGcb9b41dd767c: [LangRef] Make @llvm.sqrt(x) return undef, rather than have UB, for negative x.
rL293242: [LangRef] Make @llvm.sqrt(x) return undef, rather than have UB, for negative x.

Summary

Some frontends emit a speculate-and-select idiom for sqrt, wherein they compute
sqrt(x), check if x is negative, and select NaN if it is:

%cmp = fcmp olt double %a, -0.000000e+00
%sqrt = call double @llvm.sqrt.f64(double %a)
%ret = select i1 %cmp, double 0x7FF8000000000000, double %sqrt

This is technically UB as the LangRef is written today if %a is ever less than
-0. But emitting code that's compliant with the current definition of sqrt
would require a branch, which would then prevent us from matching this idiom in
SelectionDAG (which we do today -- ISD::FSQRT has defined behavior on negative
inputs), because SelectionDAG looks at one BB at a time.
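The dataflow of the speculate-and-select idiom can be sketched in C++ as follows. This is only an illustration: `sqrtSpeculateAndSelect` is a hypothetical name, and `std::sqrt` merely stands in for `@llvm.sqrt.f64` (unlike the intrinsic as then specified, `std::sqrt` is defined for negative inputs and may set `errno`).

```cpp
#include <cmath>
#include <limits>

// Hypothetical helper mirroring the three IR instructions above.
// The square root is computed unconditionally (speculated), and a
// select then discards it in favor of NaN when the input is negative.
double sqrtSpeculateAndSelect(double a) {
  const double nan = std::numeric_limits<double>::quiet_NaN();
  // Corresponds to: %cmp = fcmp olt double %a, -0.0
  // An ordered less-than is false for NaN inputs and for -0.0 itself.
  bool isNegative = a < -0.0;
  // Corresponds to: %sqrt = call double @llvm.sqrt.f64(double %a)
  double speculated = std::sqrt(a);
  // Corresponds to: %ret = select i1 %cmp, double NaN, double %sqrt
  return isNegative ? nan : speculated;
}
```

Because the call is not guarded by a branch, the whole pattern stays in one basic block, which is the property the summary says SelectionDAG matching relies on.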

Nothing in LLVM takes advantage of this undefined behavior, as far as we can
tell, and the fact that llvm.sqrt has UB dates from its initial addition to the
LangRef.

Diff Detail

Repository: rL LLVM

Event Timeline

jlebar created this revision. Jan 17 2017, 12:15 AM

Herald added a subscriber: wdng. Jan 17 2017, 12:15 AM

jlebar added a child revision: D28793: [NVPTX] Auto-upgrade some NVPTX intrinsics to LLVM target-generic code. Jan 17 2017, 12:16 AM

LGTM.
Maybe sanity check with Sanjoy?

llvm/docs/LangRef.rst, line 10064 (on Diff #84640): I believe the usual spelling is "returns `undef`" or "the result is undefined".

This revision is now accepted and ready to land. Jan 17 2017, 8:13 AM

This LGTM; especially since llvm::isSafeToSpeculativelyExecute already returns true for calls to llvm.sqrt (so if some place in LLVM thinks llvm.sqrt may have UB, we're already miscompiling).

Nothing in LLVM takes advantage of this undefined behavior, as far as we can tell, and the fact that llvm.sqrt has UB dates from its initial addition to the LangRef.

This is false: we take advantage of this to lower @llvm.sqrt() to libm sqrt() on platforms which don't have a native sqrt instruction, and that can have side-effects. See also https://reviews.llvm.org/D28335 .

This revision now requires changes to proceed. Jan 17 2017, 11:44 AM

In D28797#648535, @efriedma wrote:

This is false: we take advantage of this to lower @llvm.sqrt() to libm sqrt() on platforms which don't have a native sqrt instruction, and that can have side-effects. See also https://reviews.llvm.org/D28335 .

Sigh.

We can lower it to if (x >= -0) libm_sqrt(x) else NaN? It's already broken per Sanjoy's comment.

I would like to make this langref change because I want to convert the opaque target-specific llvm.nvvm.sqrt.f intrinsic to something involving llvm.sqrt.f32 (so that all of our sqrt-optimization machinery can be brought to bear).

In D28797#648535, @efriedma wrote:

Nothing in LLVM takes advantage of this undefined behavior, as far as we can tell, and the fact that llvm.sqrt has UB dates from its initial addition to the LangRef.

This is false: we take advantage of this to lower @llvm.sqrt() to libm sqrt() on platforms which don't have a native sqrt instruction, and that can have side-effects. See also https://reviews.llvm.org/D28335 .

What about NaN? A NaN input is legal according to LangRef AFAICT, and errno would still be set. That would lead me to think that lowering unconditionally to libm sqrt() isn't correct.

In D28797#648535, @efriedma wrote:

Nothing in LLVM takes advantage of this undefined behavior, as far as we can tell, and the fact that llvm.sqrt has UB dates from its initial addition to the LangRef.

This is false: we take advantage of this to lower @llvm.sqrt() to libm sqrt() on platforms which don't have a native sqrt instruction, and that can have side-effects. See also https://reviews.llvm.org/D28335 .

We do, but this seems somewhat accidental. Most of the libm intrinsics have this problem, and sqrt is the only one for which we have this undefined behavior.

We can lower it to if (x >= -0) libm_sqrt(x) else NaN?

That would be correct. It's not particularly efficient, but that's not important as long as we aren't unconditionally transforming @sqrt() to @llvm.sqrt().
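A minimal sketch of the guarded lowering discussed here, assuming a C++ stand-in for the expanded libcall. `guardedSqrt` is a hypothetical name and not an LLVM function; the point is only that the libm routine is never reached for inputs on which it could report a domain error.

```cpp
#include <cerrno>
#include <cmath>
#include <limits>

// Hypothetical expansion: call libm sqrt only when x >= -0.0.
// Negative inputs and NaN inputs both fail the ordered comparison and
// take the NaN path, so the library call cannot set errno for them.
double guardedSqrt(double x) {
  if (x >= -0.0)
    return std::sqrt(x); // in-domain: no domain error is possible here
  return std::numeric_limits<double>::quiet_NaN();
}
```

The price is the extra compare and branch on every call, which is the inefficiency acknowledged in the reply above.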

We do, but this seems somewhat accidental. Most of the libm intrinsics have this problem, and sqrt is the only one for which we have this undefined behavior.

Indeed. Now that I look at LegalizeDAG, I don't think it makes a lot of sense to legalize ISD::FSQRT to if (x >= -0) libm_sqrt(x) else NaN while leaving every other ExpandFPLibCall alone.

This langref change at least brings sqrt into line with the rest of the llvm math functions, and makes the legalization code as broken for it as it is for most everything else.

Indeed, one could argue that the issue in the legalization code is unrelated to this issue in the langref. ISD::FSQRT has defined behavior on negative values. The fact that we're incorrectly legalizing it to libm_sqrt arguably has nothing to do with the semantics of @llvm.sqrt.f32.

@efriedma, are you OK with this change going in, on that basis?

I'm not sure I really like the logic of "all these other intrinsics are broken, therefore we should break llvm.sqrt()", especially since we don't really use most of the intrinsics in question.

If we're going to ignore the problem, at least get rid of the "(which allows for better optimization, because there is no need to worry about errno being set)" bit.

On a mostly unrelated note, making llvm.sqrt() return undef rather than NaN for negative numbers is completely useless; we might as well just say that @llvm.sqrt is precisely the IEEE 754 squareRoot() operation minus any side-effects. We can mark a call "nnan" to indicate that the backend doesn't need to worry about negative numbers and nans.

In D28797#648781, @efriedma wrote:

I'm not sure I really like the logic of "all these other intrinsics are broken, therefore we should break llvm.sqrt()", especially since we don't really use most of the intrinsics in question.

If we're going to ignore the problem, at least get rid of the "(which allows for better optimization, because there is no need to worry about errno being set)" bit.

On a mostly unrelated note, making llvm.sqrt() return undef rather than NaN for negative numbers is completely useless; we might as well just say that @llvm.sqrt is precisely the IEEE 754 squareRoot() operation minus any side-effects. We can mark a call "nnan" to indicate that the backend doesn't need to worry about negative numbers and nans.

Yes! Please just remove it.

I had been under some impression that we'd have a potential problem doing this because the corresponding SDAG node inherits these undef properties, and who knows what all of the target lowering does... but it doesn't: SelectionDAGBuilder::visitCall has this:

case LibFunc::sqrt:
case LibFunc::sqrtf:
case LibFunc::sqrtl:
case LibFunc::sqrt_finite:
case LibFunc::sqrtf_finite:
case LibFunc::sqrtl_finite:
  if (visitUnaryFloatCall(I, ISD::FSQRT))
    return;
  break;

so this is just an IR issue. Let's rationalize this.

Update per Eli and Mehdi's comments.

Harbormaster completed remote builds in B3006: Diff 84744. Jan 17 2017, 2:26 PM

I'm not sure I really like the logic of "all these other intrinsics are broken, therefore we should break llvm.sqrt()", especially since we don't really use most of the intrinsics in question.

My thinking is more along the lines of, "all of these intrinsics, including sqrt, are broken. Therefore we should at least bring the langref more in line with (a) our desired semantics and (b) the semantics that various frontends are currently using."

The reason that even sqrt is currently broken is -- at least as I understand -- the speculate-and-select IR idiom is lowered to ISD::FSQRT, which is then sometimes legalized into libm_sqrt, which may set errno. It's true that according to the langref as it exists you shouldn't write sqrt speculate-and-select, but as Sanjoy said, we mark sqrt as safe-to-speculate, so, who knows what the optimizer may do.

If we're going to ignore the problem, at least get rid of the "(which allows for better optimization, because there is no need to worry about errno being set)" bit.

Done.

Edited because phab decided to post an old version of my comment.

On a mostly unrelated note, making llvm.sqrt() return undef rather than NaN for negative numbers is completely useless; we might as well just say that @llvm.sqrt is precisely the IEEE 754 squareRoot() operation minus any side-effects. We can mark a call "nnan" to indicate that the backend doesn't need to worry about negative numbers and nans.

No response to this?

In D28797#648865, @efriedma wrote:

On a mostly unrelated note, making llvm.sqrt() return undef rather than NaN for negative numbers is completely useless; we might as well just say that @llvm.sqrt is precisely the IEEE 754 squareRoot() operation minus any side-effects. We can mark a call "nnan" to indicate that the backend doesn't need to worry about negative numbers and nans.

No response to this?

Discussed on IRC with Hal, he convinced me to try to make LLVM think that llvm.sqrt returns NaN for negative inputs, and then of course we can update the langref as you suggest.

No promises I'll be able to make it work. And with the changes I plan, we will continue to have the problem of legalizing llvm.sqrt to libm_sqrt, incorrectly adding side-effects to the function call, as we do for most of the other intrinsics.

Fine, I guess... I'll try to find time to fix llvm.sqrt() lowering.

This revision is now accepted and ready to land. Jan 17 2017, 2:45 PM

jlebar planned changes to this revision. Jan 17 2017, 4:07 PM

Re-requesting review on this. I've added as deps the only changes I believe we need to make in order to make our handling of llvm.sqrt as correct as our handling of the other transcendental intrinsics.

jlebar requested review of this revision. Jan 19 2017, 4:52 PM

This revision is now accepted and ready to land. Jan 19 2017, 4:52 PM

Is everyone happy with this change? I'm about to land D28928, which is the last prerequisite to this.

Hearing no objections, I'm pushing this now.

Closed by commit rL293242: [LangRef] Make @llvm.sqrt(x) return undef, rather than have UB, for negative x. (authored by jlebar). Jan 26 2017, 5:09 PM

This revision was automatically updated to reflect the committed changes.

spatel mentioned this in D39204: [CodeGen] map sqrt libcalls to llvm.sqrt when errno is not set. Oct 31 2017, 12:45 PM

spatel mentioned this in rL317031: [CodeGen] map sqrt libcalls to llvm.sqrt when errno is not set. Oct 31 2017, 1:20 PM

spatel mentioned this in D57359: [GlobalISel] Introduce a G_FSQRT generic instruction. Jan 29 2019, 5:29 PM

Revision Contents

Path                                                    Size
llvm/trunk/docs/LangRef.rst                             7 lines
llvm/trunk/lib/Transforms/Utils/SimplifyLibCalls.cpp    19 lines

Diff 85990

llvm/trunk/docs/LangRef.rst

 ::

       declare x86_fp80 @llvm.sqrt.f80(x86_fp80 %Val)
       declare fp128 @llvm.sqrt.f128(fp128 %Val)
       declare ppc_fp128 @llvm.sqrt.ppcf128(ppc_fp128 %Val)

 Overview:
 """""""""

 The '``llvm.sqrt``' intrinsics return the sqrt of the specified operand,
-returning the same value as the libm '``sqrt``' functions would. Unlike
-``sqrt`` in libm, however, ``llvm.sqrt`` has undefined behavior for
-negative numbers other than -0.0 (which allows for better optimization,
-because there is no need to worry about errno being set).
-``llvm.sqrt(-0.0)`` is defined to return -0.0 like IEEE sqrt.
+returning the same value as the libm '``sqrt``' functions would, but without
+trapping or setting ``errno``.

 Arguments:
 """"""""""

 The argument and return value are floating point numbers of the same
 type.

 Semantics:

llvm/trunk/lib/Transforms/Utils/SimplifyLibCalls.cpp

     if (Op2C->isExactlyValue(-0.5) &&
         hasUnaryFloatFn(TLI, Op2->getType(), LibFunc_sqrt, LibFunc_sqrtf,
                         LibFunc_sqrtl)) {
       // If -ffast-math:
       // pow(x, -0.5) -> 1.0 / sqrt(x)
       if (CI->hasUnsafeAlgebra()) {
         IRBuilder<>::FastMathFlagGuard Guard(B);
         B.setFastMathFlags(CI->getFastMathFlags());

-        // Here we cannot lower to an intrinsic because C99 sqrt() and llvm.sqrt
-        // are not guaranteed to have the same semantics.
+        // TODO: If the pow call is an intrinsic, we should lower to the sqrt
+        // intrinsic, so we match errno semantics. We also should check that the
+        // target can in fact lower the sqrt intrinsic -- we currently have no way
+        // to ask this question other than asking whether the target has a sqrt
+        // libcall, which is a sufficient but not necessary condition.
         Value *Sqrt = emitUnaryFloatFnCall(Op1, TLI->getName(LibFunc_sqrt), B,
                                            Callee->getAttributes());

         return B.CreateFDiv(ConstantFP::get(CI->getType(), 1.0), Sqrt, "sqrtrecip");
       }
     }

     if (Op2C->isExactlyValue(0.5) &&
         hasUnaryFloatFn(TLI, Op2->getType(), LibFunc_sqrt, LibFunc_sqrtf,
                         LibFunc_sqrtl)) {

       // In -ffast-math, pow(x, 0.5) -> sqrt(x).
       if (CI->hasUnsafeAlgebra()) {
         IRBuilder<>::FastMathFlagGuard Guard(B);
         B.setFastMathFlags(CI->getFastMathFlags());

-        // Unlike other math intrinsics, sqrt has differerent semantics
-        // from the libc function. See LangRef for details.
+        // TODO: As above, we should lower to the sqrt intrinsic if the pow is an
+        // intrinsic, to match errno semantics.
         return emitUnaryFloatFnCall(Op1, TLI->getName(LibFunc_sqrt), B,
                                     Callee->getAttributes());
       }

       // Expand pow(x, 0.5) to (x == -infinity ? +infinity : fabs(sqrt(x))).
       // This is faster than calling pow, and still handles negative zero
       // and negative infinity correctly.
       // TODO: In finite-only mode, this could be just fabs(sqrt(x)).
       Value *Inf = ConstantFP::getInfinity(CI->getType());
       Value *NegInf = ConstantFP::getInfinity(CI->getType(), true);

+      // TODO: As above, we should lower to the sqrt intrinsic if the pow is an
+      // intrinsic, to match errno semantics.
       Value *Sqrt = emitUnaryFloatFnCall(Op1, "sqrt", B, Callee->getAttributes());

       Module *M = Callee->getParent();
       Function *FabsF = Intrinsic::getDeclaration(M, Intrinsic::fabs,
                                                   CI->getType());
       Value *FAbs = B.CreateCall(FabsF, Sqrt);

       Value *FCmp = B.CreateFCmpOEQ(Op1, NegInf);

 [...]

     return B.CreateFMul(
                              Callee->getName(), B, Callee->getAttributes()),
         "logmul");
   return Ret;
 }

 Value *LibCallSimplifier::optimizeSqrt(CallInst *CI, IRBuilder<> &B) {
   Function *Callee = CI->getCalledFunction();
   Value *Ret = nullptr;
+  // TODO: Once we have a way (other than checking for the existince of the
+  // libcall) to tell whether our target can lower @llvm.sqrt, relax the
+  // condition below.
   if (TLI->has(LibFunc_sqrtf) && (Callee->getName() == "sqrt" ||
                                   Callee->getIntrinsicID() == Intrinsic::sqrt))
     Ret = optimizeUnaryDoubleFP(CI, B, true);

   if (!CI->hasUnsafeAlgebra())
     return Ret;

   Instruction *I = dyn_cast<Instruction>(CI->getArgOperand(0));
   if (!I || I->getOpcode() != Instruction::FMul || !I->hasUnsafeAlgebra())
     return Ret;
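
The `pow(x, 0.5)` expansion named in the SimplifyLibCalls.cpp comment, `(x == -infinity ? +infinity : fabs(sqrt(x)))`, can be checked on its edge cases with a small standalone sketch. `powHalfExpanded` is a hypothetical name used only for illustration. It is ordinary C++ rather than the IRBuilder code from the patch, so the conditional here is evaluated lazily, whereas the emitted IR computes the sqrt unconditionally and selects afterwards.

```cpp
#include <cmath>
#include <limits>

// Hypothetical scalar model of the expansion:
//   pow(x, 0.5) -> (x == -inf ? +inf : fabs(sqrt(x)))
// fabs turns sqrt(-0.0) == -0.0 into +0.0, matching pow(-0.0, 0.5),
// and the explicit comparison handles pow(-inf, 0.5) == +inf.
double powHalfExpanded(double x) {
  const double inf = std::numeric_limits<double>::infinity();
  if (x == -inf)
    return inf;
  return std::fabs(std::sqrt(x));
}
```

For finite negative inputs the sqrt yields NaN and `fabs` keeps it NaN, which agrees with `pow` returning NaN for a negative base and a non-integer exponent.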