This is an archive of the discontinued LLVM Phabricator instance.

llvm/test/Transforms/InstCombine/double-float-shrink-1.ll
363 ↗	(On Diff #158403)	It looks like optimizeBinaryDoubleFP doesn't do all the same checks optimizeUnaryDoubleFP does? optimizeUnaryDoubleFP intentionally chooses not to optimize operations where the result isn't truncated because the extra bits might actually be useful.

evandro added inline comments. Jul 31 2018, 4:00 PM

llvm/test/Transforms/InstCombine/double-float-shrink-1.ll
363 ↗	(On Diff #158403)	Arguably, this should not shrink.

evandro added inline comments. Jul 31 2018, 4:07 PM

llvm/test/Transforms/InstCombine/double-float-shrink-1.ll
363 ↗	(On Diff #158403)	Besides here, `optimizeBinaryDoubleFP()` is also used for `f{min,max}()` and `copysign()`. Neither of the latter functions needs the extra bits in the result.
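
The point about `f{min,max}()` and `copysign()` can be illustrated outside LLVM. The following is a minimal sketch, not code from the patch: the helper names are hypothetical, and it relies only on the standard `<cmath>` overloads, where the `float` overloads of `std::fmin` and `std::copysign` stand in for `fminf` and `copysignf`. Because these functions return one of their arguments (or an argument with its sign changed), the float call followed by an extension gives exactly the same value as the double call on extended arguments.

```cpp
#include <cmath>

// Wide path: extend both float arguments, then take the double minimum.
double fminWide(float a, float b) {
  return std::fmin(static_cast<double>(a), static_cast<double>(b));
}

// Shrunk path: take the float minimum (float overload), then extend it.
double fminShrunk(float a, float b) {
  return static_cast<double>(std::fmin(a, b));
}

// Wide path for copysign: operate on the extended arguments.
double copysignWide(float mag, float sgn) {
  return std::copysign(static_cast<double>(mag), static_cast<double>(sgn));
}

// Shrunk path for copysign: operate in float, then extend the result.
double copysignShrunk(float mag, float sgn) {
  return static_cast<double>(std::copysign(mag, sgn));
}
```

Since extending a `float` to `double` is exact, both paths agree bit for bit, which is why no result check is needed for these callers.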

spatel added inline comments. Jul 31 2018, 4:07 PM

llvm/test/Transforms/InstCombine/double-float-shrink-1.ll
363 ↗	(On Diff #158403)	An argument in favor of shrinking: it may reduce the number of instructions as shown in this test.

evandro added inline comments. Jul 31 2018, 4:14 PM

llvm/test/Transforms/InstCombine/double-float-shrink-1.ll
363 ↗	(On Diff #158403)	And the series in `powf()` is probably shorter and faster than in `pow()`.

evandro updated this revision to Diff 158435. Jul 31 2018, 6:07 PM

evandro edited the summary of this revision.

I think we should commit this based on fixing the miscompile alone; we can debate the other diff in a follow-on patch if needed.
But let's see if @efriedma agrees.

In D50113#1184429, @spatel wrote:

I think we should commit this based on fixing the miscompile alone; we can debate the other diff in a follow-on patch if needed.

The miscompile was never noticed because the shrinking was always discarded. It seems sensible to me that both are fixed in one swoop.

evandro added inline comments. Aug 2 2018, 9:10 AM

llvm/test/Transforms/InstCombine/double-float-shrink-1.ll
363 ↗	(On Diff #158403)	On the other hand, this shrink is only performed when `-enable-double-float-shrink` is specified or with FMF, which implies the attribute `afn`.

I'd like to get this right... both the CheckRetType check, and the "infinite loop" check from optimizeUnaryDoubleFP are probably relevant. Can we share code between the two functions?

Maybe the CheckRetType check isn't critical if we're only doing this transform by default when afn is enabled, but it's still losing a lot of bits, particularly if the input is supposed to be an exact number. pow(2.f, 0.5f) is a lot different from (double)powf(2.f, 0.5f).

(If you want to fix the potential miscompile for 7.0, I'd accept a patch to just completely disable shrinking for pow.)
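
The precision concern above can be made concrete with a small sketch. This is an illustration under stated assumptions, not code from the patch: the helper names are hypothetical, and the `float` overload of `std::pow` is used as a stand-in for `powf`.

```cpp
#include <cmath>

// Full-precision path: extend the float arguments and call the double pow().
double powWide(float base, float expo) {
  return std::pow(static_cast<double>(base), static_cast<double>(expo));
}

// Shrunk path: evaluate in float (the float overload of std::pow),
// then extend the already-rounded result to double.
double powShrunk(float base, float expo) {
  return static_cast<double>(std::pow(base, expo));
}

// Absolute difference between the two paths.
double shrinkError(float base, float expo) {
  return std::fabs(powWide(base, expo) - powShrunk(base, expo));
}
```

For `shrinkError(2.f, 0.5f)` the difference is non-zero, because the double value of the square root of 2 is not representable as a `float`. If the caller truncates the result to `float` anyway, those bits are discarded either way, which is the case the transform is meant to catch.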

In D50113#1186263, @efriedma wrote:

I'd like to get this right... both the CheckRetType check, and the "infinite loop" check from optimizeUnaryDoubleFP are probably relevant. Can we share code between the two functions?

Maybe the CheckRetType check isn't critical if we're only doing this transform by default when afn is enabled, but it's still losing a lot of bits, particularly if the input is supposed to be an exact number. pow(2.f, 0.5f) is a lot different from (double)powf(2.f, 0.5f).

(If you want to fix the potential miscompile for 7.0, I'd accept a patch to just completely disable shrinking for pow.)

Just to make it clear, CheckRetType only matters for optimizeUnaryDoubleFP(); optimizeBinaryDoubleFP() doesn't have it.

I agree with you, but wonder how this could ever work. It didn't, so probably nobody ever used it. So, why was it added? If it has no defensible purpose, then, by all means, it should be dropped.

So, why was it added?

I would assume someone added it alongside a bunch of other operations where it matters more, and didn't write enough tests.

In general, the point of the transform is to catch code where someone accidentally writes "float z = pow(x, y);" instead of "float z = powf(x, y);" (And sometimes, it isn't practical to fix the code.)

Alternatively, optimizeBinaryDoubleFP() could be fixed to shrink only "float f = pow()", while keeping its other users, f{min,max}() and copysign(), happy. That should be easy to satisfy, given how simple these functions are.

Refactored the helper functions for shrinking unary and binary functions into a single one, while keeping all their functionality, in rL338905.

Keep the double precision pow() when the result is too.
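
The rule adopted here can be modeled without any LLVM types. The sketch below is a deliberately simplified illustration: `UseKind` and `mayShrink` are hypothetical names, and the real helper also checks, among other things, that the call returns `double` and that the arguments were extended from `float`.

```cpp
#include <vector>

// Hypothetical stand-in for the instructions that consume the call's result.
enum class UseKind { FPTruncToFloat, Other };

// Decide whether a double call may be rewritten to its float variant.
// When isPrecise is set, every use of the result must truncate it to float;
// otherwise the extra bits might be observed, so the call is left alone.
bool mayShrink(bool isPrecise, const std::vector<UseKind> &uses) {
  if (isPrecise) {
    for (UseKind u : uses)
      if (u != UseKind::FPTruncToFloat)
        return false;
  }
  return true;
}
```

In this model, `pow()` would pass `isPrecise = true`, while callers such as `f{min,max}()` and `copysign()` would leave it at `false`.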

LGTM

This revision is now accepted and ready to land. Aug 3 2018, 2:48 PM

Closed by commit rL339046: [SLC] Fix shrinking of pow() (authored by evandro). Aug 6 2018, 12:40 PM

This revision was automatically updated to reflect the committed changes.

Revision Contents

Path	Size
llvm/trunk/lib/Transforms/Utils/SimplifyLibCalls.cpp	30 lines
llvm/trunk/test/Transforms/InstCombine/double-float-shrink-1.ll	12 lines

Diff 159356

llvm/trunk/lib/Transforms/Utils/SimplifyLibCalls.cpp

@@ 959 lines not shown @@ if (ConstantFP *Const = dyn_cast<ConstantFP>(Val)) {
     if (!losesInfo)
       return ConstantFP::get(Const->getContext(), F);
   }
   return nullptr;
 }

 /// Shrink double -> float functions.
 static Value *optimizeDoubleFP(CallInst *CI, IRBuilder<> &B,
-                               bool isBinary, bool doResultCheck = false) {
+                               bool isBinary, bool isPrecise = false) {
   if (!CI->getType()->isDoubleTy())
     return nullptr;

-  // Check if all the uses of the function are converted to float.
-  if (doResultCheck)
+  // If not all the uses of the function are converted to float, then bail out.
+  // This matters if the precision of the result is more important than the
+  // precision of the arguments.
+  if (isPrecise)
     for (User *U : CI->users()) {
       FPTruncInst *Cast = dyn_cast<FPTruncInst>(U);
       if (!Cast || !Cast->getType()->isFloatTy())
         return nullptr;
     }

   // If this is something like 'g((double) float)', convert to 'gf(float)'.
   Value *V[2];
@@ 35 lines not shown @@ else
     R = isBinary ? emitBinaryFloatFnCall(V[0], V[1], CalleeNm, B, CalleeAt)
                  : emitUnaryFloatFnCall(V[0], CalleeNm, B, CalleeAt);

   return B.CreateFPExt(R, B.getDoubleTy());
 }

 /// Shrink double -> float for unary functions.
 static Value *optimizeUnaryDoubleFP(CallInst *CI, IRBuilder<> &B,
-                                    bool doResultCheck = false) {
-  return optimizeDoubleFP(CI, B, false, doResultCheck);
+                                    bool isPrecise = false) {
+  return optimizeDoubleFP(CI, B, false, isPrecise);
 }

 /// Shrink double -> float for binary functions.
 static Value *optimizeBinaryDoubleFP(CallInst *CI, IRBuilder<> &B,
-                                     bool doResultCheck = false) {
-  return optimizeDoubleFP(CI, B, true, doResultCheck);
+                                     bool isPrecise = false) {
+  return optimizeDoubleFP(CI, B, true, isPrecise);
 }

 // cabs(z) -> sqrt((creal(z)*creal(z)) + (cimag(z)*cimag(z)))
 Value *LibCallSimplifier::optimizeCAbs(CallInst *CI, IRBuilder<> &B) {
   if (!CI->isFast())
     return nullptr;

   // Propagate fast-math flags from the existing call to new instructions.
@@ 104 lines not shown @@ Value *LibCallSimplifier::optimizePow(CallInst *Pow, IRBuilder<> &B) {
   Function *Callee = Pow->getCalledFunction();
   AttributeList Attrs = Callee->getAttributes();
   StringRef Name = Callee->getName();
   Module *Module = Pow->getModule();
   Type *Ty = Pow->getType();
   Value *Shrunk = nullptr;
   bool Ignored;

-  if (UnsafeFPShrink &&
-      Name == TLI->getName(LibFunc_pow) && hasFloatVersion(Name))
-    Shrunk = optimizeUnaryDoubleFP(Pow, B, true);
-
   // Propagate the math semantics from the call to any created instructions.
   IRBuilder<>::FastMathFlagGuard Guard(B);
   B.setFastMathFlags(Pow->getFastMathFlags());

+  // Shrink pow() to powf() if the arguments are single precision,
+  // unless the result is expected to be double precision.
+  if (UnsafeFPShrink &&
+      Name == TLI->getName(LibFunc_pow) && hasFloatVersion(Name))
+    Shrunk = optimizeBinaryDoubleFP(Pow, B, true);
+
   // Evaluate special cases related to the base.

   // pow(1.0, x) -> 1.0
   if (match(Base, m_FPOne()))
     return Base;

   // pow(2.0, x) -> exp2(x)
   if (match(Base, m_SpecificFP(2.0))) {
@@ 78 lines not shown @@ Value *LibCallSimplifier::optimizePow(CallInst *Pow, IRBuilder<> &B) {
   // pow(x, n) -> x * x * x * ....
   if (Pow->isFast()) {
     APFloat ExpoA = abs(ExpoC->getValueAPF());
     // We limit to a max of 7 fmul(s). Thus the maximum exponent is 32.
     // This transformation applies to integer exponents only.
     if (!ExpoA.isInteger() ||
         ExpoA.compare
             (APFloat(ExpoA.getSemantics(), 32.0)) == APFloat::cmpGreaterThan)
-      return nullptr;
+      return Shrunk;

     // We will memoize intermediate products of the Addition Chain.
     Value *InnerChain[33] = {nullptr};
     InnerChain[1] = Base;
     InnerChain[2] = B.CreateFMul(Base, Base, "square");

     // We cannot readily convert a non-double type (like float) to a double.
     // So we first convert it to something which could be converted to double.
     ExpoA.convert(APFloat::IEEEdouble(), APFloat::rmTowardZero, &Ignored);
     Value *FMul = getPow(InnerChain, ExpoA.convertToDouble(), B);

     // If the exponent is negative, then get the reciprocal.
     if (ExpoC->isNegative())
       FMul = B.CreateFDiv(ConstantFP::get(Ty, 1.0), FMul, "reciprocal");
     return FMul;
   }

-  return nullptr;
+  return Shrunk;
 }

 Value *LibCallSimplifier::optimizeExp2(CallInst *CI, IRBuilder<> &B) {
   Function *Callee = CI->getCalledFunction();
   Value *Ret = nullptr;
   StringRef Name = Callee->getName();
   if (UnsafeFPShrink && Name == "exp2" && hasFloatVersion(Name))
     Ret = optimizeUnaryDoubleFP(CI, B, true);
@@ 1,418 lines not shown @@

llvm/trunk/test/Transforms/InstCombine/double-float-shrink-1.ll

@@ 338 lines not shown @@
 ; CHECK-NEXT: [[CALL:%.*]] = call fast double @logb(double [[CONV]])
 ; CHECK-NEXT: ret double [[CALL]]
 ;
   %conv = fpext float %f to double
   %call = call fast double @logb(double %conv)
   ret double %call
 }

-; FIXME: Miscompile - we dropped the 2nd argument!
-
 define float @pow_test1(float %f, float %g) {
 ; CHECK-LABEL: @pow_test1(
-; CHECK-NEXT: [[POWF:%.*]] = call fast float @powf(float [[F:%.*]])
+; CHECK-NEXT: [[POWF:%.*]] = call fast float @powf(float %f, float %g)
 ; CHECK-NEXT: ret float [[POWF]]
 ;
   %df = fpext float %f to double
   %dg = fpext float %g to double
   %call = call fast double @pow(double %df, double %dg)
   %fr = fptrunc double %call to float
   ret float %fr
 }

-; TODO: This should shrink?
-
 define double @pow_test2(float %f, float %g) {
 ; CHECK-LABEL: @pow_test2(
-; CHECK-NEXT: [[DF:%.*]] = fpext float [[F:%.*]] to double
-; CHECK-NEXT: [[DG:%.*]] = fpext float [[G:%.*]] to double
-; CHECK-NEXT: [[CALL:%.*]] = call fast double @pow(double [[DF]], double [[DG]])
-; CHECK-NEXT: ret double [[CALL]]
+; CHECK: [[POW:%.*]] = call fast double @pow(double %df, double %dg)
+; CHECK-NEXT: ret double [[POW]]
 ;
   %df = fpext float %f to double
   %dg = fpext float %g to double
   %call = call fast double @pow(double %df, double %dg)
   ret double %call
 }

 define float @sin_test1(float %f) {
@@ 170 lines not shown @@
