This is an archive of the discontinued LLVM Phabricator instance.

[ConstProp] allow folding for fma that produces NaN
ClosedPublic

Authored by spatel on Sep 11 2019, 7:30 AM.


Details

Reviewers

arsenm
fhahn
reames
hfinkel
cameron.mcinally

Commits

rG3f5a80836503: [ConstProp] allow folding for fma that produces NaN
rL371735: [ConstProp] allow folding for fma that produces NaN

Summary

Folding for fma/fmuladd was added in rL202914. As seen in the existing/unchanged tests, that works to propagate NaN if it's already an input, but we should also fold an fma() that creates NaN.

From IEEE-754-2008 7.2 "Invalid Operation", there are 2 clauses that apply to fma, so I added tests for those patterns:

c) fusedMultiplyAdd: fusedMultiplyAdd(0, ∞, c) or fusedMultiplyAdd(∞, 0, c) unless c is a quiet NaN; if c is a quiet NaN then it is implementation defined whether the invalid operation exception is signaled
d) addition or subtraction or fusedMultiplyAdd: magnitude subtraction of infinities, such as: addition(+∞, −∞)
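As a minimal sketch of these two clauses outside of LLVM, the standard C++ `std::fma` can be used to exercise both patterns. The helper names `zeroTimesInf` and `infMinusInf` are hypothetical and exist only for illustration; the sketch assumes an IEEE-754 conforming host and no fast-math flags.

```cpp
#include <cmath>
#include <limits>

// Clause (c): zero times infinity inside an fma, with a non-NaN addend c.
// Under default IEEE-754 exception handling the result is a quiet NaN.
inline double zeroTimesInf(double c) {
  const double inf = std::numeric_limits<double>::infinity();
  return std::fma(0.0, inf, c);
}

// Clause (d): the product 7.0 * -inf is -inf and the addend is +inf,
// so the final addition is a magnitude subtraction of infinities.
inline double infMinusInf() {
  const double inf = std::numeric_limits<double>::infinity();
  return std::fma(7.0, -inf, inf);
}
```

Both helpers should return a value for which `std::isnan` is true, which is the same outcome the new tests expect the constant folder to produce.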

Diff Detail

Repository: rL LLVM

Event Timeline

spatel created this revision. Sep 11 2019, 7:30 AM

Herald added a project: Restricted Project. Sep 11 2019, 7:30 AM

Herald added subscribers: hiraditya, wdng, mcrosier.

spatel added a reviewer: cameron.mcinally. Sep 11 2019, 7:31 AM

LGTM

This revision is now accepted and ready to land. Sep 11 2019, 7:49 AM

cameron.mcinally added a subscriber: eli.friedman. Sep 11 2019, 8:03 AM

cameron.mcinally added inline comments.

llvm/test/Transforms/ConstProp/fma.ll
196 ↗	(On Diff #219711)	Should this class of FMA return poison? There's no valid result for fma(0, inf, C), where C != NaN. @eli.friedman

cameron.mcinally accepted this revision. Sep 11 2019, 8:06 AM

cameron.mcinally added inline comments.

llvm/test/Transforms/ConstProp/fma.ll

196 ↗	(On Diff #219711)	Bah, never mind. That only applies to conversions that return integer results. FP results are defined as NaNs:

"For operations producing results in floating-point format, the default result of an operation that signals the invalid operation exception shall be a quiet NaN that should provide some diagnostic information (see 6.2)."

reames added inline comments. Sep 11 2019, 9:10 AM

llvm/lib/Analysis/ConstantFolding.cpp
2242 ↗	(On Diff #219711)	Does opInvalidOp always imply NaN? If so, then the name should be updated or at least clarifying comments added to the APFloat header. If not, then this code may be incorrect.

spatel marked an inline comment as done. Sep 11 2019, 9:52 AM

spatel added inline comments.

llvm/lib/Analysis/ConstantFolding.cpp
2242 ↗	(On Diff #219711)	From the perspective/usage of fusedMultiplyAdd() (and I think for any of the calls that produce an FP result), APFloat::opInvalidOp implies producing a NaN.

In the general case, APFloat::opInvalidOp does not necessarily imply NaN because we use that status for non-FP APIs. For example, we have this for IEEEFloat::convertToInteger():
"we provide deterministic values in case of an invalid operation exception, namely zero for NaNs and the minimal or maximal value respectively for underflow or overflow."

So I think this code is correct. Add something like the above text to the header comment?
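The distinction drawn above can be sketched in plain C++ without LLVM headers. Everything named here (`Status`, `FmaResult`, `IntResult`, `checkedFma`, `checkedToInt32`) is hypothetical and only loosely mirrors APFloat; it is not the APFloat implementation. As a simplification, NaN inputs just propagate with an OK status, so signaling-NaN handling is not modeled.

```cpp
#include <cmath>
#include <cstdint>
#include <limits>

// Hypothetical status flag, loosely mirroring APFloat::opStatus.
enum class Status { OK, InvalidOp };

struct FmaResult { double value; Status status; };
struct IntResult { std::int32_t value; Status status; };

// FP-producing operation: an invalid operation yields a quiet NaN.
inline FmaResult checkedFma(double a, double b, double c) {
  const double qnan = std::numeric_limits<double>::quiet_NaN();
  // Simplification (assumption): NaN inputs propagate without raising a status.
  if (std::isnan(a) || std::isnan(b) || std::isnan(c))
    return {qnan, Status::OK};
  const bool zeroTimesInf =
      (a == 0.0 && std::isinf(b)) || (std::isinf(a) && b == 0.0);
  // The exact product is infinite only if an operand is infinite.
  const bool productIsInf = !zeroTimesInf && (std::isinf(a) || std::isinf(b));
  const bool productNegative = std::signbit(a) != std::signbit(b);
  const bool infMinusInf =
      productIsInf && std::isinf(c) && productNegative != std::signbit(c);
  if (zeroTimesInf || infMinusInf)
    return {qnan, Status::InvalidOp};
  return {std::fma(a, b, c), Status::OK};
}

// Integer-producing operation: there is no NaN to return, so an invalid
// conversion yields a deterministic value (0, or the min/max integer).
inline IntResult checkedToInt32(double x) {
  if (std::isnan(x))
    return {0, Status::InvalidOp};
  const double t = std::trunc(x);
  if (t > 2147483647.0)
    return {std::numeric_limits<std::int32_t>::max(), Status::InvalidOp};
  if (t < -2147483648.0)
    return {std::numeric_limits<std::int32_t>::min(), Status::InvalidOp};
  return {static_cast<std::int32_t>(t), Status::OK};
}
```

In this model a caller of `checkedFma` can rely on "InvalidOp implies NaN", while a caller of `checkedToInt32` cannot, which is the reason the status alone is not a NaN indicator across the whole API.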

cameron.mcinally marked an inline comment as done. Sep 11 2019, 10:01 AM

cameron.mcinally added inline comments.

llvm/lib/Analysis/ConstantFolding.cpp
2242 ↗	(On Diff #219711)	> In the general case, APFloat::opInvalidOp does not necessarily imply NaN because we use that status for non-FP APIs. For example, we have this for IEEEFloat::convertToInteger(): "we provide deterministic values in case of an invalid operation exception, namely zero for NaNs and the minimal or maximal value respectively for underflow or overflow."

There's been some discussion about returning poison for these cases wrt the constrained intrinsics. I don't know if a final decision was made. Just FYI.

spatel marked an inline comment as done. Sep 11 2019, 10:05 AM

spatel added inline comments.

llvm/lib/Analysis/ConstantFolding.cpp
2242 ↗	(On Diff #219711)	Also note from IEEE-754: "For operations producing results in floating-point format, the default result of an operation that signals the invalid operation exception shall be a quiet NaN." ...since APFloat models that spec, if fusedMultiplyAdd() or any other FP math raises invalidOp status, but then does not produce a NaN, I'd say that's a bug in APFloat.

Patch updated:
Added clarifying comment to APFloat header.

Herald added a subscriber: dexonsmith. Sep 11 2019, 10:17 AM

cameron.mcinally marked an inline comment as done. Sep 11 2019, 11:09 AM

cameron.mcinally added inline comments.

llvm/lib/Analysis/ConstantFolding.cpp
2242 ↗	(On Diff #219711)	Digressing a bit, but APFloat does not handle some SNaN operands appropriately either. An SNaN operand to a signaling operation should return InvalidOp+QNaN. If I'm not mistaken, I've seen a few cases where SNaN operands return APFloat::opOK.

spatel marked an inline comment as done. Sep 12 2019, 6:09 AM

spatel added inline comments.

llvm/lib/Analysis/ConstantFolding.cpp

2242 ↗	(On Diff #219711)	Good to know; my guess is that NaN propagation hasn't been looked at closely given the non-strict approach of general LLVM IR.

I think this FMA case is working as expected though:

define double @inf_times_zero_plus_signalling_nan()  {
; CHECK-LABEL: @inf_times_zero_plus_signalling_nan(
; CHECK-NEXT:    ret double 0x7FF8000000000000
;
  %1 = call double @llvm.fma.f64(double 0x7FF0000000000000, double -0.0, double 0x7FF0000000000001)
  ret double %1
}
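To make the hex constants in this test easier to read, here is a small sketch that classifies IEEE-754 binary64 bit patterns. The helper names are hypothetical, and the sketch assumes the IEEE-754-2008 convention that a set most-significant fraction bit marks a quiet NaN (some legacy targets use the opposite convention).

```cpp
#include <cstdint>

// binary64 layout: 1 sign bit, 11 exponent bits, 52 fraction bits.
constexpr std::uint64_t kExpMask  = 0x7FF0000000000000ULL;
constexpr std::uint64_t kFracMask = 0x000FFFFFFFFFFFFFULL;
constexpr std::uint64_t kQuietBit = 0x0008000000000000ULL;

// NaN: all exponent bits set and a non-zero fraction.
constexpr bool isNaNBits(std::uint64_t bits) {
  return (bits & kExpMask) == kExpMask && (bits & kFracMask) != 0;
}

// Quiet NaN: a NaN whose most-significant fraction bit is set.
constexpr bool isQuietNaNBits(std::uint64_t bits) {
  return isNaNBits(bits) && (bits & kQuietBit) != 0;
}

// Signaling NaN: a NaN whose most-significant fraction bit is clear.
constexpr bool isSignalingNaNBits(std::uint64_t bits) {
  return isNaNBits(bits) && (bits & kQuietBit) == 0;
}
```

Under these assumptions the addend `0x7FF0000000000001` classifies as a signaling NaN, the expected result `0x7FF8000000000000` as a quiet NaN, and `0x7FF0000000000000` as +infinity rather than a NaN.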

cameron.mcinally marked an inline comment as done. Sep 12 2019, 7:09 AM

cameron.mcinally added inline comments.

llvm/lib/Analysis/ConstantFolding.cpp
2242 ↗	(On Diff #219711)	IIRC, the result is correct, but APFloat returns opOK. It should return opInvalidOp. I can whip up a test case, if you'd like. In any event, the SNaN issue is a general problem. It shouldn't hold this patch up...

Closed by commit rL371735: [ConstProp] allow folding for fma that produces NaN (authored by spatel). Sep 12 2019, 7:13 AM

This revision was automatically updated to reflect the committed changes.

Herald added a subscriber: kristina. Sep 12 2019, 7:13 AM

spatel mentioned this in D67721: [InstSimplify] fold fma/fmuladd with a NaN operand. Sep 18 2019, 10:40 AM

spatel mentioned this in rL373455: [InstSimplify] fold fma/fmuladd with a NaN or undef operand. Oct 2 2019, 5:11 AM

spatel mentioned this in rGbe21ceb56597: [InstSimplify] fold fma/fmuladd with a NaN or undef operand.

Revision Contents

Path	Size
llvm/trunk/include/llvm/ADT/APFloat.h	5 lines
llvm/trunk/lib/Analysis/ConstantFolding.cpp	10 lines
llvm/trunk/test/Transforms/ConstProp/fma.ll	37 lines

Diff 219910

llvm/trunk/include/llvm/ADT/APFloat.h

@@ enum roundingMode {
     rmTowardNegative,
     rmTowardZero,
     rmNearestTiesToAway
   };

   /// IEEE-754R 7: Default exception handling.
   ///
   /// opUnderflow or opOverflow are always returned or-ed with opInexact.
+  ///
+  /// APFloat models this behavior specified by IEEE-754:
+  ///   "For operations producing results in floating-point format, the default
+  ///    result of an operation that signals the invalid operation exception
+  ///    shall be a quiet NaN."
   enum opStatus {
     opOK = 0x00,
     opInvalidOp = 0x01,
     opDivByZero = 0x02,
     opOverflow = 0x04,
     opUnderflow = 0x08,
     opInexact = 0x10
   };

llvm/trunk/lib/Analysis/ConstantFolding.cpp

@@ static Constant *ConstantFoldScalarCall3(StringRef Name,
   if (const auto *Op1 = dyn_cast<ConstantFP>(Operands[0])) {
     if (const auto *Op2 = dyn_cast<ConstantFP>(Operands[1])) {
       if (const auto *Op3 = dyn_cast<ConstantFP>(Operands[2])) {
         switch (IntrinsicID) {
         default: break;
         case Intrinsic::fma:
         case Intrinsic::fmuladd: {
           APFloat V = Op1->getValueAPF();
-          APFloat::opStatus s = V.fusedMultiplyAdd(Op2->getValueAPF(),
-                                                   Op3->getValueAPF(),
-                                                   APFloat::rmNearestTiesToEven);
-          if (s != APFloat::opInvalidOp)
-            return ConstantFP::get(Ty->getContext(), V);
-
-          return nullptr;
+          V.fusedMultiplyAdd(Op2->getValueAPF(), Op3->getValueAPF(),
+                             APFloat::rmNearestTiesToEven);
+          return ConstantFP::get(Ty->getContext(), V);
         }
         }
       }
     }
   }

   if (const auto *Op1 = dyn_cast<ConstantInt>(Operands[0])) {
     if (const auto *Op2 = dyn_cast<ConstantInt>(Operands[1])) {

llvm/trunk/test/Transforms/ConstProp/fma.ll

@@ ;
   %1 = call double @llvm.fma.f64(double 7.0, double 0xFFF0000000000000, double 0.0)
   ret double %1
 }

 ; -inf + inf --> NaN

 define double @inf_product_opposite_inf_addend_1() {
 ; CHECK-LABEL: @inf_product_opposite_inf_addend_1(
-; CHECK-NEXT:    [[TMP1:%.*]] = call double @llvm.fma.f64(double 7.000000e+00, double 0xFFF0000000000000, double 0x7FF0000000000000)
-; CHECK-NEXT:    ret double [[TMP1]]
+; CHECK-NEXT:    ret double 0x7FF8000000000000
 ;
   %1 = call double @llvm.fma.f64(double 7.0, double 0xFFF0000000000000, double 0x7FF0000000000000)
   ret double %1
 }

 ; inf + -inf --> NaN

 define double @inf_product_opposite_inf_addend_2() {
 ; CHECK-LABEL: @inf_product_opposite_inf_addend_2(
-; CHECK-NEXT:    [[TMP1:%.*]] = call double @llvm.fma.f64(double 7.000000e+00, double 0x7FF0000000000000, double 0xFFF0000000000000)
-; CHECK-NEXT:    ret double [[TMP1]]
+; CHECK-NEXT:    ret double 0x7FF8000000000000
 ;
   %1 = call double @llvm.fma.f64(double 7.0, double 0x7FF0000000000000, double 0xFFF0000000000000)
   ret double %1
 }

 ; -inf + inf --> NaN

 define double @inf_product_opposite_inf_addend_3() {
 ; CHECK-LABEL: @inf_product_opposite_inf_addend_3(
-; CHECK-NEXT:    [[TMP1:%.*]] = call double @llvm.fma.f64(double 0xFFF0000000000000, double 4.200000e+01, double 0x7FF0000000000000)
-; CHECK-NEXT:    ret double [[TMP1]]
+; CHECK-NEXT:    ret double 0x7FF8000000000000
 ;
   %1 = call double @llvm.fma.f64(double 0xFFF0000000000000, double 42.0, double 0x7FF0000000000000)
   ret double %1
 }

 ; inf + -inf --> NaN

 define double @inf_product_opposite_inf_addend_4() {
 ; CHECK-LABEL: @inf_product_opposite_inf_addend_4(
-; CHECK-NEXT:    [[TMP1:%.*]] = call double @llvm.fma.f64(double 0x7FF0000000000000, double 4.200000e+01, double 0xFFF0000000000000)
-; CHECK-NEXT:    ret double [[TMP1]]
+; CHECK-NEXT:    ret double 0x7FF8000000000000
 ;
   %1 = call double @llvm.fma.f64(double 0x7FF0000000000000, double 42.0, double 0xFFF0000000000000)
   ret double %1
 }

 ; 0 * -inf --> NaN

 define double @inf_times_zero_1() {
 ; CHECK-LABEL: @inf_times_zero_1(
-; CHECK-NEXT:    [[TMP1:%.*]] = call double @llvm.fma.f64(double 0.000000e+00, double 0xFFF0000000000000, double 4.200000e+01)
-; CHECK-NEXT:    ret double [[TMP1]]
+; CHECK-NEXT:    ret double 0x7FF8000000000000
 ;
   %1 = call double @llvm.fma.f64(double 0.0, double 0xFFF0000000000000, double 42.0)
   ret double %1
 }

 ; 0 * inf --> NaN

 define double @inf_times_zero_2() {
 ; CHECK-LABEL: @inf_times_zero_2(
-; CHECK-NEXT:    [[TMP1:%.*]] = call double @llvm.fma.f64(double 0.000000e+00, double 0x7FF0000000000000, double 4.200000e+01)
-; CHECK-NEXT:    ret double [[TMP1]]
+; CHECK-NEXT:    ret double 0x7FF8000000000000
 ;
   %1 = call double @llvm.fma.f64(double 0.0, double 0x7FF0000000000000, double 42.0)
   ret double %1
 }

 ; -inf * 0 --> NaN

 define double @inf_times_zero_3() {
 ; CHECK-LABEL: @inf_times_zero_3(
-; CHECK-NEXT:    [[TMP1:%.*]] = call double @llvm.fma.f64(double 0xFFF0000000000000, double 0.000000e+00, double 4.200000e+01)
-; CHECK-NEXT:    ret double [[TMP1]]
+; CHECK-NEXT:    ret double 0x7FF8000000000000
 ;
   %1 = call double @llvm.fma.f64(double 0xFFF0000000000000, double 0.0, double 42.0)
   ret double %1
 }

 ; inf * 0 --> NaN

 define double @inf_times_zero_4() {
 ; CHECK-LABEL: @inf_times_zero_4(
-; CHECK-NEXT:    [[TMP1:%.*]] = call double @llvm.fma.f64(double 0x7FF0000000000000, double 0.000000e+00, double 4.200000e+01)
-; CHECK-NEXT:    ret double [[TMP1]]
+; CHECK-NEXT:    ret double 0x7FF8000000000000
 ;
   %1 = call double @llvm.fma.f64(double 0x7FF0000000000000, double 0.0, double 42.0)
   ret double %1
 }

 ; -0 * -inf --> NaN

 define double @inf_times_zero_5() {
 ; CHECK-LABEL: @inf_times_zero_5(
-; CHECK-NEXT:    [[TMP1:%.*]] = call double @llvm.fma.f64(double -0.000000e+00, double 0xFFF0000000000000, double 4.200000e+01)
-; CHECK-NEXT:    ret double [[TMP1]]
+; CHECK-NEXT:    ret double 0x7FF8000000000000
 ;
   %1 = call double @llvm.fma.f64(double -0.0, double 0xFFF0000000000000, double 42.0)
   ret double %1
 }

 ; -0 * inf --> NaN

 define double @inf_times_zero_6() {
 ; CHECK-LABEL: @inf_times_zero_6(
-; CHECK-NEXT:    [[TMP1:%.*]] = call double @llvm.fma.f64(double -0.000000e+00, double 0x7FF0000000000000, double 4.200000e+01)
-; CHECK-NEXT:    ret double [[TMP1]]
+; CHECK-NEXT:    ret double 0x7FF8000000000000
 ;
   %1 = call double @llvm.fma.f64(double -0.0, double 0x7FF0000000000000, double 42.0)
   ret double %1
 }

 ; -inf * -0 --> NaN

 define double @inf_times_zero_7() {
 ; CHECK-LABEL: @inf_times_zero_7(
-; CHECK-NEXT:    [[TMP1:%.*]] = call double @llvm.fma.f64(double 0xFFF0000000000000, double -0.000000e+00, double 4.200000e+01)
-; CHECK-NEXT:    ret double [[TMP1]]
+; CHECK-NEXT:    ret double 0x7FF8000000000000
 ;
   %1 = call double @llvm.fma.f64(double 0xFFF0000000000000, double -0.0, double 42.0)
   ret double %1
 }

 ; inf * -0 --> NaN

 define double @inf_times_zero_8() {
 ; CHECK-LABEL: @inf_times_zero_8(
-; CHECK-NEXT:    [[TMP1:%.*]] = call double @llvm.fma.f64(double 0x7FF0000000000000, double -0.000000e+00, double 4.200000e+01)
-; CHECK-NEXT:    ret double [[TMP1]]
+; CHECK-NEXT:    ret double 0x7FF8000000000000
 ;
   %1 = call double @llvm.fma.f64(double 0x7FF0000000000000, double -0.0, double 42.0)
   ret double %1
 }
