This is an archive of the discontinued LLVM Phabricator instance.

llvm/lib/Analysis/ValueTracking.cpp
2681–2683 ↗	(On Diff #85063)	I introduced the variable "CI" above, and I did not want to shadow it here. I could split out this variable rename into a separate patch if you really think that's necessary?

I think this is actually incorrect, see https://llvm.org/bugs/show_bug.cgi?id=31702#c1.

arsenm added inline comments.Jan 19 2017, 6:16 PM

llvm/lib/Analysis/ValueTracking.cpp
2681–2683 ↗	(On Diff #85063)	OK, that's fine

Update based on my understanding of what fabs really means, see https://llvm.org/bugs/show_bug.cgi?id=31702#c1.

Basically, I had my "assumption directions" wrong. IEEE 754 says the sign bit
of NaNs is unspecified, but the behavior of fabs on a NaN is to unset the sign
bit unconditionally. Therefore the correct thing is to assume that any NaN we
get has the sign bit set to 1! IOW the NaN sign bit is "might be anything",
not "undefined, we can pretend it's whatever we want."

Herald added a subscriber: wdng. · View Herald TranscriptJan 19 2017, 6:32 PM

Handle the possibility of negative NaNs correctly, and add tests.

jlebar added 1 blocking reviewer(s): efriedma.Jan 19 2017, 6:54 PM

This revision now requires review to proceed.Jan 19 2017, 6:54 PM

Eli, since I got this wrong once, would you mind having a look?

In D28928#651243, @jlebar wrote:

Eli, since I got this wrong once, would you mind having a look?

Eli, friendly ping?

efriedma added inline comments.Jan 24 2017, 11:38 AM

llvm/lib/Analysis/ValueTracking.cpp
2673 ↗	(On Diff #85090)	You need to prove that the result isn't a NaN and isn't -0. Using nsz to prove the result isn't -0 is suspicious. For nsz, LangRef says "Allow optimizations to treat the sign of a zero argument or result as insignificant.", so the operand could in fact be -0. Actually, sqrt(0) could produce -0 under that definition. Really, what we want to do is propagate in the fast-math flags from caller; we want to transform nnan nsz fabs(sqrt(x)) -> sqrt(x); that transform depends on the fast-math flags of the fabs, not the fast-math flags of the sqrt(). I think the root of this mess is "SignBitOnly": it doesn't really capture what we care about. We can split floating-point numbers into categories: +inf, -inf, +0, -0, positive finite, negative finite, negative nan, and positive nan. CannotBeOrderedLessThanZero needs to prove that the input is not in the category -inf or the category negative finite. SignBitMustBeZero needs to prove that the input is not in one of the categories -0, -inf, negative finite, or negative nan. To transform nnan fabs(x) -> x, we need to prove that the input is not in one of the categories -0, -inf, or negative finite. To transform nnan nsz fabs(x) -> x, we need to prove that the input is not in one of the categories -inf, or negative finite. To transform fast fabs(x) -> x, we need to prove the input is not in the category negative finite. Does this make sense, or is it completely crazy? :)

jlebar added inline comments.Jan 24 2017, 6:46 PM

llvm/lib/Analysis/ValueTracking.cpp
2673 ↗	(On Diff #85090)	For nsz, LangRef says "Allow optimizations to treat the sign of a zero argument or result as insignificant.", so the operand could in fact be -0. Actually, sqrt(0) could produce -0 under that definition. What I understand it means by "not significant" is that anywhere we have an nsz operation that produces or consumes a zero, we can assume that zero is positive or negative zero, at our convenience. So I agree `nsz sqrt(+0)` could return `-0` under this definition, but I think the point is that we get to choose. Reading it the opposite way -- that `nsz sqrt(+0)` might return `-0` and we have to account for this as a possibility -- would mean that `nsz` limits our optimization opportunities, which seems clearly not what it's intended to do. Really, what we want to do is propagate in the fast-math flags from caller; we want to transform nnan nsz fabs(sqrt(x)) -> sqrt(x); that transform depends on the fast-math flags of the fabs, not the fast-math flags of the sqrt(). I think this would be an additional correct optimization, but per above I don't think that what we have here is wrong. I think the root of this mess is "SignBitOnly": it doesn't really capture what we care about. [...] Does this make sense, or is it completely crazy? :) Makes a ton of sense! Just a few things to add, which we may already agree on: Per above, if the source of x is nsz or fast, we should be able to treat -0 and +0 as equivalent. If we can prove that x is either -0 or +0, then we can behave as though x is in both the -0 and +0 categories. (Similarly, if the source of x is nnan or fast, we can assume that x is not +/- nan, and if the source of x is ninf or fast, we can assume that x is not +/- inf.) If x is produced by some ieee754 operation other than copy, negate, abs, and copySign, and may be nan, we don't know whether it's positive or negative nan (unless we know something about the target).

hfinkel added inline comments.Jan 25 2017, 6:27 AM

llvm/lib/Analysis/ValueTracking.cpp
2673 ↗	(On Diff #85090)	Reading it the opposite way -- that nsz sqrt(+0) might return -0 and we have to account for this as a possibility -- would mean that nsz limits our optimization opportunities, which seems clearly not what it's intended to do. I agree. If the user passes -fno-signed-zeros, then they're saying that we can ignore the possibility of signed zeros when determining optimization legality. They might still exist, and the fact that they still exist might still be observable, but the user is promising that they don't care that this might be observed. One key question here is: Are we allowed to observe signed zeros inconsistently? If x == -0, for example, can `if (x != -0) print(x);` print -0? I think the answer must be yes. This code cannot be sensibly compiled under -fno-signed-zeros.

Reading it the opposite way -- that nsz sqrt(+0) might return -0 and we have to account for this as a possibility -- would mean that nsz *limits* our optimization opportunities, which seems clearly not what it's intended to do.

Well, if everything is nsz, that doesn't really come up; the user is under the same restriction as the sqrt() itself, so it doesn't limit optimization. The weird edge-case is where you have an nsz instruction which feeds into a non-nsz instruction: either we restrict optimization like you describe, or users of an nsz instruction are effectively nsz themselves. The latter seems worse... for example, if we started using SignBitMustBeZero to feed into integer KnownBits computations, we could have an integer value 0x80000000 whose top bit is "known" to be zero, which would lead to undefined behavior.

If x == -0, for example, can if (x != -0) print(x); print -0?

No, because 0. == -0. is true. :)

Maybe it should be renamed SignBitIsZeroOrInsignificant?

! In D28928#656636, @efriedma wrote:
The weird edge-case is where you have an nsz instruction which feeds into a non-nsz instruction: either we restrict optimization like you describe, or users of an nsz instruction are effectively nsz themselves.

Agreed, these are the two options. But I would amend the second option to say that the users of these instructions are "halfway nsz" themselves. Their inputs are nsz, but their outputs are not.

The latter seems worse... for example, if we started using SignBitMustBeZero to feed into integer KnownBits computations, we could have an integer value 0x80000000 whose top bit is "known" to be zero, which would lead to undefined behavior.

Are you worried about the case where we have something like

float f = nsz some_op;
int i = bitcast f to int
bool b = top bit of i

? I think as soon as we leave the fp space we can no longer play fast and loose with signed zeroes. Like, the result of the bitcast must have some value for its sign bit -- doesn't matter what it is because nsz -- and then we must stick with that, consistently.

If x == -0, for example, can if (x != -0) print(x); print -0?

No, because 0. == -0. is true. :)

Not sure if kidding around; if not, I think Hal was trying to ask: Can`if (x == 0 && !signbit(x)) print (x);` print "-0"? He's saying yes (if the whole program is compiled with nsz), and I think I agree.

I think as soon as we leave the fp space we can no longer play fast and loose with signed zeroes.

Agreed.

Can`if (x == 0 && !signbit(x)) print (x);` print "-0"? He's saying yes (if the whole program is compiled with nsz), and I think I agree.

I think it depends on how exactly "signbit" is implemented; if signbit and print are both integer operations, we probably have to be conservative because we're no longer in FP space. I think we can be more flexible with an nsz copysign, as in "if (x == 0 && copysign(1, x) == 1)".

I think we're basically in agreement here that this patch is OK?

efriedma added inline comments.Jan 25 2017, 3:03 PM

llvm/test/Transforms/InstSimplify/floating-point-arithmetic.ll
237 ↗	(On Diff #85090)	I'm not convinced this transform is legal without an nsz flag on the fabs... Patch is fine otherwise.

jlebar added inline comments.Jan 25 2017, 3:06 PM

llvm/test/Transforms/InstSimplify/floating-point-arithmetic.ll
237 ↗	(On Diff #85090)	I thought we talked about this above and established that, because fabs is a floating-point op and here its input comes from an nsz op, we can assume that the sign of the zero is "insignificant". My understanding was that only if we have a non-fp op involved do things become complicated. Maybe this would be easier on IRC? Please ping me and Hal.

jlebar added a reviewer: scanon.Jan 25 2017, 3:39 PM

efriedma accepted this revision.Jan 25 2017, 3:42 PM

efriedma added inline comments.

llvm/test/Transforms/InstSimplify/floating-point-arithmetic.ll
237 ↗	(On Diff #85090)	Here's what I understand how you're looking at this from this thread: nsz fabs() can return -0.0. bitcasting a float to an int "freezes" the sign bit (it can be either set or unset, but it's consistent). Floating-point operations whose operand is, directly or indirectly, an nsz instruction, have indeterminate signed zeros. I think 3 might be a bit too aggressive; it seems like a good idea to enforce the invariant that in non-fast-math code, fabs(x) never has the sign bit set. With this patch, that breaks if you use a library compiled with fast-math to produce x. That said, mixing fast-math and non-fast-math code is likely to be pretty infrequent, so maybe we shouldn't worry about it. Feel free to commit as-is.

This revision is now accepted and ready to land.Jan 25 2017, 3:42 PM

jlebar added inline comments.Jan 25 2017, 4:18 PM

llvm/test/Transforms/InstSimplify/floating-point-arithmetic.ll
237 ↗	(On Diff #85090)	Okay, will do, thanks. FWIW I totally see your point, and personally I don't feel particularly strongly about this. But I also don't think we have a particularly clear idea about how we want this to work at the moment. So...yeah. Thanks again.

jlebar mentioned this in D28797: [LangRef] Make @llvm.sqrt(x) return undef, rather than have UB, for negative x..Jan 25 2017, 4:18 PM

Closed by commit rL293115: [ValueTracking] Implement SignBitMustBeZero correctly for sqrt. (authored by jlebar). · Explain WhyJan 25 2017, 4:21 PM

This revision was automatically updated to reflect the committed changes.

spatel mentioned this in D57359: [GlobalISel] Introduce a G_FSQRT generic instruction.Jan 29 2019, 5:29 PM

Revision Contents

Path

Size

llvm/

trunk/

lib/

Analysis/

ValueTracking.cpp

16 lines

test/

Transforms/

InstSimplify/

floating-point-arithmetic.ll

54 lines

Diff 85834

llvm/trunk/lib/Analysis/ValueTracking.cpp

Show First 20 Lines • Show All 2,639 Lines • ▼ Show 20 Lines	return cannotBeOrderedLessThanZeroImpl(I->getOperand(1), TLI, SignBitOnly,
cannotBeOrderedLessThanZeroImpl(I->getOperand(2), TLI, SignBitOnly,		cannotBeOrderedLessThanZeroImpl(I->getOperand(2), TLI, SignBitOnly,
Depth + 1);		Depth + 1);
case Instruction::FPExt:		case Instruction::FPExt:
case Instruction::FPTrunc:		case Instruction::FPTrunc:
// Widening/narrowing never change sign.		// Widening/narrowing never change sign.
return cannotBeOrderedLessThanZeroImpl(I->getOperand(0), TLI, SignBitOnly,		return cannotBeOrderedLessThanZeroImpl(I->getOperand(0), TLI, SignBitOnly,
Depth + 1);		Depth + 1);
case Instruction::Call:		case Instruction::Call:
Intrinsic::ID IID = getIntrinsicForCallSite(cast<CallInst>(I), TLI);		const auto *CI = cast<CallInst>(I);
		Intrinsic::ID IID = getIntrinsicForCallSite(CI, TLI);
switch (IID) {		switch (IID) {
default:		default:
break;		break;
case Intrinsic::maxnum:		case Intrinsic::maxnum:
return cannotBeOrderedLessThanZeroImpl(I->getOperand(0), TLI, SignBitOnly,		return cannotBeOrderedLessThanZeroImpl(I->getOperand(0), TLI, SignBitOnly,
Depth + 1) \|\|		Depth + 1) \|\|
cannotBeOrderedLessThanZeroImpl(I->getOperand(1), TLI, SignBitOnly,		cannotBeOrderedLessThanZeroImpl(I->getOperand(1), TLI, SignBitOnly,
Depth + 1);		Depth + 1);
case Intrinsic::minnum:		case Intrinsic::minnum:
return cannotBeOrderedLessThanZeroImpl(I->getOperand(0), TLI, SignBitOnly,		return cannotBeOrderedLessThanZeroImpl(I->getOperand(0), TLI, SignBitOnly,
Depth + 1) &&		Depth + 1) &&
cannotBeOrderedLessThanZeroImpl(I->getOperand(1), TLI, SignBitOnly,		cannotBeOrderedLessThanZeroImpl(I->getOperand(1), TLI, SignBitOnly,
Depth + 1);		Depth + 1);
case Intrinsic::exp:		case Intrinsic::exp:
case Intrinsic::exp2:		case Intrinsic::exp2:
case Intrinsic::fabs:		case Intrinsic::fabs:
		return true;

case Intrinsic::sqrt:		case Intrinsic::sqrt:
		// sqrt(x) is always >= -0 or NaN. Moreover, sqrt(x) == -0 iff x == -0.
		if (!SignBitOnly)
return true;		return true;
		return CI->hasNoNaNs() && (CI->hasNoSignedZeros() \|\|
		CannotBeNegativeZero(CI->getOperand(0), TLI));

case Intrinsic::powi:		case Intrinsic::powi:
if (ConstantInt *CI = dyn_cast<ConstantInt>(I->getOperand(1))) {		if (ConstantInt *Exponent = dyn_cast<ConstantInt>(I->getOperand(1))) {
// powi(x,n) is non-negative if n is even.		// powi(x,n) is non-negative if n is even.
if (CI->getBitWidth() <= 64 && CI->getSExtValue() % 2u == 0)		if (Exponent->getBitWidth() <= 64 && Exponent->getSExtValue() % 2u == 0)
return true;		return true;
}		}
return cannotBeOrderedLessThanZeroImpl(I->getOperand(0), TLI, SignBitOnly,		return cannotBeOrderedLessThanZeroImpl(I->getOperand(0), TLI, SignBitOnly,
Depth + 1);		Depth + 1);
case Intrinsic::fma:		case Intrinsic::fma:
case Intrinsic::fmuladd:		case Intrinsic::fmuladd:
// x*x+y is non-negative if y is non-negative.		// x*x+y is non-negative if y is non-negative.
return I->getOperand(0) == I->getOperand(1) &&		return I->getOperand(0) == I->getOperand(1) &&
▲ Show 20 Lines • Show All 1,756 Lines • Show Last 20 Lines

llvm/trunk/test/Transforms/InstSimplify/floating-point-arithmetic.ll

Show First 20 Lines • Show All 98 Lines • ▼ Show 20 Lines	;
%4 = call float @sqrtf(float %3)		%4 = call float @sqrtf(float %3)
%5 = call float @sqrtf(float %4)		%5 = call float @sqrtf(float %4)
%6 = call float @sqrtf(float %5)		%6 = call float @sqrtf(float %5)
%7 = fadd float %6, 0.0		%7 = fadd float %6, 0.0
ret float %7		ret float %7
}		}

declare float @llvm.fabs.f32(float)		declare float @llvm.fabs.f32(float)
		declare float @llvm.sqrt.f32(float)

; CHECK-LABEL: @fabs_select_positive_constants(		; CHECK-LABEL: @fabs_select_positive_constants(
; CHECK: %select = select i1 %cmp, float 1.000000e+00, float 2.000000e+00		; CHECK: %select = select i1 %cmp, float 1.000000e+00, float 2.000000e+00
; CHECK-NEXT: ret float %select		; CHECK-NEXT: ret float %select
define float @fabs_select_positive_constants(i32 %c) {		define float @fabs_select_positive_constants(i32 %c) {
%cmp = icmp eq i32 %c, 0		%cmp = icmp eq i32 %c, 0
%select = select i1 %cmp, float 1.0, float 2.0		%select = select i1 %cmp, float 1.0, float 2.0
%fabs = call float @llvm.fabs.f32(float %select)		%fabs = call float @llvm.fabs.f32(float %select)
▲ Show 20 Lines • Show All 75 Lines • ▼ Show 20 Lines
; CHECK: %select = select i1 %cmp, float 0xFFF8000000000000, float 0.000000e+00		; CHECK: %select = select i1 %cmp, float 0xFFF8000000000000, float 0.000000e+00
; CHECK: %fabs = call float @llvm.fabs.f32(float %select)		; CHECK: %fabs = call float @llvm.fabs.f32(float %select)
define float @fabs_select_negnan_zero(float addrspace(1)* %out, i32 %c) {		define float @fabs_select_negnan_zero(float addrspace(1)* %out, i32 %c) {
%cmp = icmp eq i32 %c, 0		%cmp = icmp eq i32 %c, 0
%select = select i1 %cmp, float 0xFFF8000000000000, float 0.0		%select = select i1 %cmp, float 0xFFF8000000000000, float 0.0
%fabs = call float @llvm.fabs.f32(float %select)		%fabs = call float @llvm.fabs.f32(float %select)
ret float %fabs		ret float %fabs
}		}

		; CHECK-LABEL: @fabs_sqrt
		; CHECK: call float @llvm.sqrt.f32
		; CHECK: call float @llvm.fabs.f32
		define float @fabs_sqrt(float %a) {
		; The fabs can't be eliminated because llvm.sqrt.f32 may return -0 or NaN with
		; an arbitrary sign bit.
		%sqrt = call float @llvm.sqrt.f32(float %a)
		%fabs = call float @llvm.fabs.f32(float %sqrt)
		ret float %fabs
		}

		; CHECK-LABEL: @fabs_sqrt_nnan
		; CHECK: call nnan float @llvm.sqrt.f32
		; CHECK: call float @llvm.fabs.f32
		define float @fabs_sqrt_nnan(float %a) {
		; The fabs can't be eliminated because the nnan sqrt may still return -0.
		%sqrt = call nnan float @llvm.sqrt.f32(float %a)
		%fabs = call float @llvm.fabs.f32(float %sqrt)
		ret float %fabs
		}

		; CHECK-LABEL: @fabs_sqrt_nsz
		; CHECK: call nsz float @llvm.sqrt.f32
		; CHECK: call float @llvm.fabs.f32
		define float @fabs_sqrt_nsz(float %a) {
		; The fabs can't be eliminated because the nsz sqrt may still return NaN.
		%sqrt = call nsz float @llvm.sqrt.f32(float %a)
		%fabs = call float @llvm.fabs.f32(float %sqrt)
		ret float %fabs
		}

		; CHECK-LABEL: @fabs_sqrt_nnan_nsz
		; CHECK: call nnan nsz float @llvm.sqrt.f32
		; CHECK-NOT: call float @llvm.fabs.f32
		define float @fabs_sqrt_nnan_nsz(float %a) {
		; The fabs can be eliminated because we're nsz and nnan.
		%sqrt = call nnan nsz float @llvm.sqrt.f32(float %a)
		%fabs = call float @llvm.fabs.f32(float %sqrt)
		ret float %fabs
		}

		; CHECK-LABEL: @fabs_sqrt_nnan_fabs
		; CHECK: call float @llvm.fabs.f32
		; CHECK: call nnan float @llvm.sqrt.f32
		; CHECK-NOT: call float @llvm.fabs.f32
		define float @fabs_sqrt_nnan_fabs(float %a) {
		; The second fabs can be eliminated because the operand to sqrt cannot be -0.
		%b = call float @llvm.fabs.f32(float %a)
		%sqrt = call nnan float @llvm.sqrt.f32(float %b)
		%fabs = call float @llvm.fabs.f32(float %sqrt)
		ret float %fabs
		}

This is an archive of the discontinued LLVM Phabricator instance.

[ValueTracking] Implement SignBitMustBeZero correctly for sqrt.ClosedPublic

Details

Diff Detail

Event Timeline

Revision Contents

Diff 85834

llvm/trunk/lib/Analysis/ValueTracking.cpp

llvm/trunk/test/Transforms/InstSimplify/floating-point-arithmetic.ll

[ValueTracking] Implement SignBitMustBeZero correctly for sqrt.
ClosedPublic