
Details

Reviewers

spatel
efriedma

Summary

This transformation helps some benchmarks in SPEC CPU2000 and CPU2006, such as 188.ammp, 447.dealII, 453.povray, and especially 300.twolf, as well as some proprietary benchmarks. Otherwise, no regressions on x86-64 or A64.

Diff Detail

Event Timeline

evandro created this revision. Jul 6 2018, 1:38 PM

Herald added subscribers: llvm-commits, hiraditya. Jul 6 2018, 1:38 PM

Tests?

llvm/lib/Transforms/Utils/SimplifyLibCalls.cpp
1138–1139	There seem to be two variants: https://godbolt.org/g/Rw1Gxt Can you output the value that doesn't match?

Missing testcases.

I'm not sure what you mean by "a hard time matching the exponent"; I can't see any reason float would be different from double, assuming you're actually using the right constant.

llvm/lib/Transforms/Utils/SimplifyLibCalls.cpp
1125	You need nsz: `pow(-0., 1./3)` returns +0, but `cbrt(-0.)` returns -0. I think I'd prefer to require afn for this; not sure it's necessary, but better to be safe. Please add explicit comments explaining why you need nnan and ninf (nnan because pow() returns a nan for negative x, ninf for `pow(-inf, 1./3)`).
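The special cases named in this comment can be checked directly against the C math library. The following is a minimal sketch, assuming a libm that follows the IEEE 754 / C Annex F special-case rules and a build without fast-math options; `powThird`, `cbrtOf`, and `demoSpecialCases` are hypothetical helper names, not part of the patch.

```cpp
#include <cassert>
#include <cmath>
#include <limits>

// Hypothetical helpers for illustration only; not part of the patch.
inline double powThird(double x) { return std::pow(x, 1.0 / 3.0); }
inline double cbrtOf(double x) { return std::cbrt(x); }

inline void demoSpecialCases() {
  const double inf = std::numeric_limits<double>::infinity();

  // nsz: pow(-0., 1./3) yields +0, while cbrt(-0.) keeps the sign.
  assert(!std::signbit(powThird(-0.0)));
  assert(std::signbit(cbrtOf(-0.0)));

  // nnan: pow() of a finite negative base with a non-integer exponent
  // is NaN, while cbrt() of a negative base is a negative real number.
  assert(std::isnan(powThird(-8.0)));
  assert(cbrtOf(-8.0) < 0.0);

  // ninf: pow(-inf, 1./3) is +inf, while cbrt(-inf) is -inf.
  assert(powThird(-inf) == inf);
  assert(cbrtOf(-inf) == -inf);
}
```

Each pair of assertions corresponds to one flag: the fold is only value-preserving when the inputs that make the two functions disagree are excluded.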

Shouldn't this patch use the existing code in replacePowWithSqrt(), so we're not (incompletely) duplicating the logic?

That code also has a TODO comment about choosing the minimal set of FMF to enable the fold. Whatever we decide that predicate will be should be identical for both transforms?

In D49040#1155206, @spatel wrote:

Shouldn't this patch use the existing code in replacePowWithSqrt(), so we're not (incompletely) duplicating the logic?

That code also has a TODO comment about choosing the minimal set of FMF to enable the fold. Whatever we decide that predicate will be should be identical for both transforms?

The logic is almost the same, except that sqrt() has a corresponding intrinsic and cbrt() doesn't. So, at least for now, methinks that it's easier to understand and review this change as a separate function. Then, if needed, both functions could be merged.

Add a test case.

evandro updated this revision to Diff 154624. Jul 9 2018, 9:25 AM

evandro added inline comments. Jul 9 2018, 10:48 AM

llvm/lib/Transforms/Utils/SimplifyLibCalls.cpp
1138–1139	Please, see any example using `float` in the test case below.

evandro marked 2 inline comments as done. Jul 9 2018, 12:11 PM

evandro added inline comments.

llvm/lib/Transforms/Utils/SimplifyLibCalls.cpp
1138–1139	My bad. I crafted the test case using the IEEE754 bits for SP instead of the bits for DP truncated for `float`.
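The mix-up is easier to see from the bit patterns. LLVM IR spells a `float` constant in the 64-bit hexadecimal form of a double that is exactly representable as a float, so the constant for 1/3 as a `float` differs from the one for 1/3 as a `double`. The sketch below assumes IEEE 754 binary32 and binary64 types; `bitsOf` and `demoOneThirdConstants` are hypothetical names used only for illustration.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Return the 64-bit pattern of a double (assumes IEEE 754 binary64).
inline std::uint64_t bitsOf(double d) {
  static_assert(sizeof(double) == sizeof(std::uint64_t),
                "expects a 64-bit double");
  std::uint64_t u;
  std::memcpy(&u, &d, sizeof u);
  return u;
}

inline void demoOneThirdConstants() {
  const double thirdDP = 1.0 / 3.0;   // 1/3 rounded to double
  const float thirdSP = 1.0f / 3.0f;  // 1/3 rounded to float

  // The double constant used by the pow() tests.
  assert(bitsOf(thirdDP) == 0x3FD5555555555555ULL);

  // The float constant, widened to double: the form used by the powf()
  // tests in pow-cbrt.ll.
  assert(bitsOf(static_cast<double>(thirdSP)) == 0x3FD5555560000000ULL);

  // The two values differ, so an exact comparison against one of them
  // does not match the other.
  assert(static_cast<double>(thirdSP) != thirdDP);
}
```

This is also why the patch selects `OneThird` by type before calling `isExactlyValue()`.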

Enable handling of float.

lebedev.ri added inline comments. Jul 9 2018, 12:13 PM

llvm/test/Transforms/InstCombine/pow-cbrt.ll
1	Just use `./utils/update_test_checks.py`

efriedma added inline comments. Jul 9 2018, 12:20 PM

llvm/lib/Transforms/Utils/SimplifyLibCalls.cpp
1125	isFast() is deprecated, because it makes the actual requirements unclear and disables optimizations where it isn't necessary. (In particular, you don't need reassoc here.)
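One way to read this suggestion, combined with the earlier request for nsz and afn, is as a predicate over individual flags rather than a single "fast" check. The sketch below is a standalone model, not LLVM code: `Flags` is a hypothetical stand-in for the per-call fast-math flags, and `mayFoldPowToCbrt` encodes the set of flags the reviewers asked for in this thread, which is an assumption about the final predicate.

```cpp
#include <cassert>

// Hypothetical stand-in for per-call fast-math flags. This is not LLVM's
// FastMathFlags class, only a model for illustration.
struct Flags {
  bool reassoc = false;  // allow reassociation
  bool nnan = false;     // assume no NaNs
  bool ninf = false;     // assume no infinities
  bool nsz = false;      // ignore the sign of zero
  bool afn = false;      // allow approximate functions
};

// Assumed predicate: require exactly the flags discussed in the review.
// reassoc is deliberately not consulted.
inline bool mayFoldPowToCbrt(const Flags &F) {
  return F.nnan && F.ninf && F.nsz && F.afn;
}

inline void demoFlagPredicate() {
  Flags Needed;
  Needed.nnan = Needed.ninf = Needed.nsz = Needed.afn = true;
  assert(mayFoldPowToCbrt(Needed));  // folds without reassoc

  Flags OnlyAfn;
  OnlyAfn.afn = true;
  assert(!mayFoldPowToCbrt(OnlyAfn));  // afn alone is not enough
}
```

Listing the flags individually documents why each one is needed and lets the fold fire on calls that are not fully "fast".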

evandro marked 2 inline comments as done. Jul 9 2018, 12:38 PM

evandro updated this revision to Diff 154679. Jul 9 2018, 12:41 PM

Refactor all of the code that simplifies pow(x, 0.5) to sqrt() by folding it into a new function that handles all simplifications to radical operations.

Allow splat vectors for trivial simplifications.

evandro added a child revision: D49273: [InstCombine] Expand the simplification of pow() into exp2(). Jul 12 2018, 4:07 PM

evandro added a child revision: D49306: [SLC] Simplify pow(x, 0.25) to sqrt(sqrt(x)). Jul 13 2018, 10:09 AM

Ping! 🔔

¡Ping! 🔔🔔

If I'm seeing it correctly, there are several independent changes going on here. Can you split this up to make the review easier?

Add all new tests with baseline CHECKs as an NFC preliminary step.
It's not clear to me what the FMF diffs in pow-sqrt.ll are showing. If we are adjusting FMF constraints on existing folds, that should be an independent patch?
The cosmetic diffs in variable names (Base, Expo, Shrunk, etc) look fine, so those can be another preliminary NFC commit before the part that needs further review.
The cbrt transform can be the last patch/commit in this series; once we have the cleanup done, that should be a very small diff.

OK. Please, stay tuned.

evandro mentioned this in rL338152: [SLC] Test simplification of pow(x, 0.333...) to cbrt(x) (NFC). Jul 27 2018, 11:57 AM

Committed rL338152 to add the baseline test case pow-cbrt.ll.

evandro added a parent revision: D50036: [SLC] Expand the simplification of pow(x, 0.5) to sqrt(x). Jul 30 2018, 7:01 PM

evandro updated this revision to Diff 158160. Jul 30 2018, 7:03 PM

evandro edited reviewers, added: efriedma; removed: davide, beanz. Jul 30 2018, 7:07 PM

evandro edited subscribers, added: davide, beanz; removed: efriedma.

evandro updated this revision to Diff 158438. Jul 31 2018, 6:13 PM

evandro edited the summary of this revision.

lebedev.ri mentioned this in D49306: [SLC] Simplify pow(x, 0.25) to sqrt(sqrt(x)). Jul 31 2018, 11:16 PM

Ping! 🔔

As we discussed in D49306, I agree that this is probably a good perf transform, but I don't think we've shown any compelling reason to do this in IR vs. DAGCombiner.

There are downsides to doing this in IR currently because we don't have a cbrt intrinsic. That means we have different behavior based on data type (vectors won't get transformed).
It's also not clear why transforming to a form with fdiv in the negative exponent case is better than a single pow instruction. (And that case probably needs some perf justification even as a backend fold.)

OK, then.

evandro removed a child revision: D49273: [InstCombine] Expand the simplification of pow() into exp2(). Aug 9 2018, 2:23 PM

evandro added a subscriber: fhahn. Aug 17 2018, 8:02 AM

spatel mentioned this in D51753: [DAGCombiner] try to convert pow(x, 1/3) to cbrt(x). Sep 6 2018, 2:16 PM

spatel mentioned this in rL342348: [DAGCombiner] try to convert pow(x, 1/3) to cbrt(x). Sep 16 2018, 9:51 AM

Diff 158438

llvm/include/llvm/Transforms/Utils/SimplifyLibCalls.h

private:
Value *optimizeWcslen(CallInst *CI, IRBuilder<> &B);		Value *optimizeWcslen(CallInst *CI, IRBuilder<> &B);
// Wrapper for all String/Memory Library Call Optimizations		// Wrapper for all String/Memory Library Call Optimizations
Value *optimizeStringMemoryLibCall(CallInst *CI, IRBuilder<> &B);		Value *optimizeStringMemoryLibCall(CallInst *CI, IRBuilder<> &B);

// Math Library Optimizations		// Math Library Optimizations
Value *optimizeCAbs(CallInst *CI, IRBuilder<> &B);		Value *optimizeCAbs(CallInst *CI, IRBuilder<> &B);
Value *optimizeCos(CallInst *CI, IRBuilder<> &B);		Value *optimizeCos(CallInst *CI, IRBuilder<> &B);
Value *optimizePow(CallInst *CI, IRBuilder<> &B);		Value *optimizePow(CallInst *CI, IRBuilder<> &B);
Value *replacePowWithSqrt(CallInst *Pow, IRBuilder<> &B);		Value *replacePowWithRoot(CallInst *Pow, IRBuilder<> &B);
Value *optimizeExp2(CallInst *CI, IRBuilder<> &B);		Value *optimizeExp2(CallInst *CI, IRBuilder<> &B);
Value *optimizeFMinFMax(CallInst *CI, IRBuilder<> &B);		Value *optimizeFMinFMax(CallInst *CI, IRBuilder<> &B);
Value *optimizeLog(CallInst *CI, IRBuilder<> &B);		Value *optimizeLog(CallInst *CI, IRBuilder<> &B);
Value *optimizeSqrt(CallInst *CI, IRBuilder<> &B);		Value *optimizeSqrt(CallInst *CI, IRBuilder<> &B);
Value *optimizeSinCosPi(CallInst *CI, IRBuilder<> &B);		Value *optimizeSinCosPi(CallInst *CI, IRBuilder<> &B);
Value *optimizeTan(CallInst *CI, IRBuilder<> &B);		Value *optimizeTan(CallInst *CI, IRBuilder<> &B);
// Wrapper for all floating point library call optimizations		// Wrapper for all floating point library call optimizations
Value *optimizeFloatingPointLibCall(CallInst *CI, LibFunc Func,		Value *optimizeFloatingPointLibCall(CallInst *CI, LibFunc Func,

llvm/lib/Transforms/Utils/SimplifyLibCalls.cpp

static const unsigned AddChain[33][2] = {
{3, 24}, {14, 14}, {4, 25}, {15, 15}, {3, 28}, {16, 16},		{3, 24}, {14, 14}, {4, 25}, {15, 15}, {3, 28}, {16, 16},
};		};

InnerChain[Exp] = B.CreateFMul(getPow(InnerChain, AddChain[Exp][0], B),		InnerChain[Exp] = B.CreateFMul(getPow(InnerChain, AddChain[Exp][0], B),
getPow(InnerChain, AddChain[Exp][1], B));		getPow(InnerChain, AddChain[Exp][1], B));
return InnerChain[Exp];		return InnerChain[Exp];
}		}

/// Use square root in place of pow(x, +/-0.5).		/// Use sqrt() for pow(x, +/-0.5) and cbrt() for pow(x, +/-0.333...).
Value *LibCallSimplifier::replacePowWithSqrt(CallInst *Pow, IRBuilder<> &B) {		Value *LibCallSimplifier::replacePowWithRoot(CallInst *Pow, IRBuilder<> &B) {
Value *Sqrt, *Base = Pow->getArgOperand(0), *Expo = Pow->getArgOperand(1);		Value *Root, *Base = Pow->getArgOperand(0), *Expo = Pow->getArgOperand(1);
		AttributeList Attrs = Pow->getCalledFunction()->getAttributes();
Module *Mod = Pow->getModule();		Module *Mod = Pow->getModule();
Type *Ty = Pow->getType();		Type *Ty = Pow->getType();

const APFloat *ExpoF;		const APFloat *ExpoF;
if (!match(Expo, m_APFloat(ExpoF)) ||		if (!match(Expo, m_APFloat(ExpoF)))
(!ExpoF->isExactlyValue(0.5) && !ExpoF->isExactlyValue(-0.5)))
return nullptr;		return nullptr;

		const double OneThird = (Ty->getTypeID() == Type::FloatTyID)
		? (1.0f / 3.0f) : (1.0 / 3.0);
		bool isHalf (ExpoF->isExactlyValue(0.5) || ExpoF->isExactlyValue(-0.5)),
		isThird (ExpoF->isExactlyValue(OneThird) ||
		ExpoF->isExactlyValue(-OneThird));
		if (!isHalf && !isThird)
		return nullptr;

		// Expand pow(x, +/-0.5) to sqrt().
		if (isHalf) {
// If errno is never set, then use the intrinsic for sqrt().		// If errno is never set, then use the intrinsic for sqrt().
if (Pow->hasFnAttr(Attribute::ReadNone)) {		if (Pow->hasFnAttr(Attribute::ReadNone)) {
Function *SqrtFn = Intrinsic::getDeclaration(Pow->getModule(),		Function *SqrtFn = Intrinsic::getDeclaration(Mod, Intrinsic::sqrt, Ty);
Intrinsic::sqrt, Ty);		Root = B.CreateCall(SqrtFn, Base, "sqrt");
Sqrt = B.CreateCall(SqrtFn, Base, "sqrt");
}		}
// Otherwise, use the libcall for sqrt().		// Otherwise, use the libcall for sqrt().
else if (hasUnaryFloatFn(TLI, Ty, LibFunc_sqrt, LibFunc_sqrtf, LibFunc_sqrtl))		else if (hasUnaryFloatFn(TLI, Ty,
		LibFunc_sqrt, LibFunc_sqrtf, LibFunc_sqrtl))
// TODO: We also should check that the target can in fact lower the sqrt()		// TODO: We also should check that the target can in fact lower the sqrt()
// libcall. We currently have no way to ask this question, so we ask if		// libcall. We currently have no way to ask this question, so we ask if
// the target has a sqrt() libcall, which is not exactly the same.		// the target has a sqrt() libcall, which is not exactly the same.
Sqrt = emitUnaryFloatFnCall(Base, TLI->getName(LibFunc_sqrt), B,		Root = emitUnaryFloatFnCall(Base, TLI->getName(LibFunc_sqrt), B, Attrs);
Pow->getCalledFunction()->getAttributes());
else		else
return nullptr;		return nullptr;

// Handle signed zero base by expanding to fabs(sqrt(x)).		// Handle signed zero base by expanding to fabs(sqrt(x)).
if (!Pow->hasNoSignedZeros()) {		if (!Pow->hasNoSignedZeros()) {
Function *FAbsFn = Intrinsic::getDeclaration(Mod, Intrinsic::fabs, Ty);		Function *FAbsFn = Intrinsic::getDeclaration(Mod, Intrinsic::fabs, Ty);
Sqrt = B.CreateCall(FAbsFn, Sqrt, "abs");		Root = B.CreateCall(FAbsFn, Root, "abs");
}		}

// Handle non finite base by expanding to		// Handle non finite base by expanding to
// (x == -infinity ? +infinity : sqrt(x)).		// (x == -infinity ? +infinity : sqrt(x)).
if (!Pow->hasNoInfs()) {		if (!Pow->hasNoInfs()) {
Value *PosInf = ConstantFP::getInfinity(Ty),		Value *PosInf = ConstantFP::getInfinity(Ty),
*NegInf = ConstantFP::getInfinity(Ty, true);		*NegInf = ConstantFP::getInfinity(Ty, true);
Value *FCmp = B.CreateFCmpOEQ(Base, NegInf, "iseq");		Value *FCmp = B.CreateFCmpOEQ(Base, NegInf, "iseq");
Sqrt = B.CreateSelect(FCmp, PosInf, Sqrt);		Root = B.CreateSelect(FCmp, PosInf, Root);
		}
		}
		// Expand pow(x, +/-0.333...) to cbrt(), but only for a regular base.
		else if (isThird &&
		Pow->hasNoInfs() && Pow->hasNoNaNs() && Pow->hasNoSignedZeros()) {
		if (hasUnaryFloatFn(TLI, Ty, LibFunc_cbrt, LibFunc_cbrtf, LibFunc_cbrtl))
		Root = emitUnaryFloatFnCall(Base, TLI->getName(LibFunc_cbrt), B, Attrs);
		else
		return nullptr;
}		}
		else
		return nullptr;

// If the exponent is negative, then get the reciprocal.		// If the exponent is negative, then get the reciprocal.
if (ExpoF->isNegative())		if (ExpoF->isNegative())
Sqrt = B.CreateFDiv(ConstantFP::get(Ty, 1.0), Sqrt, "reciprocal");		Root = B.CreateFDiv(ConstantFP::get(Ty, 1.0), Root, "reciprocal");

return Sqrt;		return Root;
}		}

Value *LibCallSimplifier::optimizePow(CallInst *Pow, IRBuilder<> &B) {		Value *LibCallSimplifier::optimizePow(CallInst *Pow, IRBuilder<> &B) {
Value *Base = Pow->getArgOperand(0), *Expo = Pow->getArgOperand(1);		Value *Base = Pow->getArgOperand(0), *Expo = Pow->getArgOperand(1);
Function *Callee = Pow->getCalledFunction();		Function *Callee = Pow->getCalledFunction();
AttributeList Attrs = Callee->getAttributes();		AttributeList Attrs = Callee->getAttributes();
StringRef Name = Callee->getName();		StringRef Name = Callee->getName();
Module *Module = Pow->getModule();		Module *Module = Pow->getModule();
// pow(x, 1.0) -> x		// pow(x, 1.0) -> x
if (match(Expo, m_FPOne()))		if (match(Expo, m_FPOne()))
return Base;		return Base;

// pow(x, 2.0) -> x * x		// pow(x, 2.0) -> x * x
if (match(Expo, m_SpecificFP(2.0)))		if (match(Expo, m_SpecificFP(2.0)))
return B.CreateFMul(Base, Base, "square");		return B.CreateFMul(Base, Base, "square");

if (Value *Sqrt = replacePowWithSqrt(Pow, B))		if (Value *Root = replacePowWithRoot(Pow, B))
return Sqrt;		return Root;

// pow(x, n) -> x * x * x * ...		// pow(x, n) -> x * x * x * ...
const APFloat *ExpoF;		const APFloat *ExpoF;
if (Pow->isFast() && match(Expo, m_APFloat(ExpoF))) {		if (Pow->isFast() && match(Expo, m_APFloat(ExpoF))) {
// We limit to a max of 7 multiplications, thus the maximum exponent is 32.		// We limit to a max of 7 multiplications, thus the maximum exponent is 32.
APFloat LimF(ExpoF->getSemantics(), 33.0),		APFloat LimF(ExpoF->getSemantics(), 33.0),
ExpoA(abs(*ExpoF));		ExpoA(abs(*ExpoF));
if (ExpoA.isInteger() && ExpoA.compare(LimF) == APFloat::cmpLessThan) {		if (ExpoA.isInteger() && ExpoA.compare(LimF) == APFloat::cmpLessThan) {

llvm/test/Transforms/InstCombine/pow-cbrt.ll

	; NOTE: Assertions have been autogenerated by utils/update_test_checks.py
	; RUN: opt < %s -instcombine -S | FileCheck %s			; RUN: opt < %s -instcombine -S | FileCheck %s

	define double @pow_intrinsic_third_fast(double %x) {			define double @pow_intrinsic_third_fast(double %x) {
	; CHECK-LABEL: @pow_intrinsic_third_fast(			; CHECK-LABEL: @pow_intrinsic_third_fast(
	; CHECK-NEXT: [[POW:%.*]] = call fast double @llvm.pow.f64(double [[X:%.*]], double 0x3FD5555555555555)			; CHECK-NEXT: [[CBRT:%.*]] = call fast double @cbrt(double %x) #1
	; CHECK-NEXT: ret double [[POW]]			; CHECK-NEXT: ret double [[CBRT]]
	;			;
	%pow = call fast double @llvm.pow.f64(double %x, double 0x3fd5555555555555)			%pow = call fast double @llvm.pow.f64(double %x, double 0x3fd5555555555555)
	ret double %pow			ret double %pow
	}			}

	define float @powf_intrinsic_third_fast(float %x) {			define float @powf_intrinsic_third_fast(float %x) {
	; CHECK-LABEL: @powf_intrinsic_third_fast(			; CHECK-LABEL: @powf_intrinsic_third_fast(
	; CHECK-NEXT: [[POW:%.*]] = call fast float @llvm.pow.f32(float [[X:%.*]], float 0x3FD5555560000000)			; CHECK-NEXT: [[CBRTF:%.*]] = call fast float @cbrtf(float %x) #1
	; CHECK-NEXT: ret float [[POW]]			; CHECK-NEXT: ret float [[CBRTF]]
	;			;
	%pow = call fast float @llvm.pow.f32(float %x, float 0x3fd5555560000000)			%pow = call fast float @llvm.pow.f32(float %x, float 0x3fd5555560000000)
	ret float %pow			ret float %pow
	}			}

	define double @pow_intrinsic_third_approx(double %x) {			define double @pow_intrinsic_third_approx(double %x) {
	; CHECK-LABEL: @pow_intrinsic_third_approx(			; CHECK-LABEL: @pow_intrinsic_third_approx(
	; CHECK-NEXT: [[POW:%.*]] = call afn double @llvm.pow.f64(double [[X:%.*]], double 0x3FD5555555555555)			; CHECK-NEXT: [[POW:%.*]] = call afn double @llvm.pow.f64(double %x, double 0x3FD5555555555555)
	; CHECK-NEXT: ret double [[POW]]			; CHECK-NEXT: ret double [[POW]]
	;			;
	%pow = call afn double @llvm.pow.f64(double %x, double 0x3fd5555555555555)			%pow = call afn double @llvm.pow.f64(double %x, double 0x3fd5555555555555)
	ret double %pow			ret double %pow
	}			}

	define float @powf_intrinsic_third_approx(float %x) {			define float @powf_intrinsic_third_approx(float %x) {
	; CHECK-LABEL: @powf_intrinsic_third_approx(			; CHECK-LABEL: @powf_intrinsic_third_approx(
	; CHECK-NEXT: [[POW:%.*]] = call afn float @llvm.pow.f32(float [[X:%.*]], float 0x3FD5555560000000)			; CHECK-NEXT: [[POW:%.*]] = call afn float @llvm.pow.f32(float %x, float 0x3FD5555560000000)
	; CHECK-NEXT: ret float [[POW]]			; CHECK-NEXT: ret float [[POW]]
	;			;
	%pow = call afn float @llvm.pow.f32(float %x, float 0x3fd5555560000000)			%pow = call afn float @llvm.pow.f32(float %x, float 0x3fd5555560000000)
	ret float %pow			ret float %pow
	}			}

	define double @pow_libcall_third_fast(double %x) {			define double @pow_libcall_third_fast(double %x) {
	; CHECK-LABEL: @pow_libcall_third_fast(			; CHECK-LABEL: @pow_libcall_third_fast(
	; CHECK-NEXT: [[POW:%.*]] = call fast double @pow(double [[X:%.*]], double 0x3FD5555555555555)			; CHECK-NEXT: [[CBRT:%.*]] = call fast double @cbrt(double %x)
	; CHECK-NEXT: ret double [[POW]]			; CHECK-NEXT: ret double [[CBRT]]
	;			;
	%pow = call fast double @pow(double %x, double 0x3fd5555555555555)			%pow = call fast double @pow(double %x, double 0x3fd5555555555555)
	ret double %pow			ret double %pow
	}			}

	define float @powf_libcall_third_fast(float %x) {			define float @powf_libcall_third_fast(float %x) {
	; CHECK-LABEL: @powf_libcall_third_fast(			; CHECK-LABEL: @powf_libcall_third_fast(
	; CHECK-NEXT: [[POW:%.*]] = call fast float @powf(float [[X:%.*]], float 0x3FD5555560000000)			; CHECK-NEXT: [[CBRTF:%.*]] = call fast float @cbrtf(float %x)
	; CHECK-NEXT: ret float [[POW]]			; CHECK-NEXT: ret float [[CBRTF]]
	;			;
	%pow = call fast float @powf(float %x, float 0x3fd5555560000000)			%pow = call fast float @powf(float %x, float 0x3fd5555560000000)
	ret float %pow			ret float %pow
	}			}

	define double @pow_intrinsic_negthird_fast(double %x) {			define double @pow_intrinsic_negthird_fast(double %x) {
	; CHECK-LABEL: @pow_intrinsic_negthird_fast(			; CHECK-LABEL: @pow_intrinsic_negthird_fast(
	; CHECK-NEXT: [[POW:%.*]] = call fast double @llvm.pow.f64(double [[X:%.*]], double 0xBFD5555555555555)			; CHECK-NEXT: [[CBRT:%.*]] = call fast double @cbrt(double %x) #1
	; CHECK-NEXT: ret double [[POW]]			; CHECK-NEXT: [[RECP:%.*]] = fdiv fast double 1.000000e+00, [[CBRT]]
				; CHECK-NEXT: ret double [[RECP]]
	;			;
	%pow = call fast double @llvm.pow.f64(double %x, double 0xbfd5555555555555)			%pow = call fast double @llvm.pow.f64(double %x, double 0xbfd5555555555555)
	ret double %pow			ret double %pow
	}			}

	define float @powf_intrinsic_negthird_fast(float %x) {			define float @powf_intrinsic_negthird_fast(float %x) {
	; CHECK-LABEL: @powf_intrinsic_negthird_fast(			; CHECK-LABEL: @powf_intrinsic_negthird_fast(
	; CHECK-NEXT: [[POW:%.*]] = call fast float @llvm.pow.f32(float [[X:%.*]], float 0xBFD5555560000000)			; CHECK-NEXT: [[CBRTF:%.*]] = call fast float @cbrtf(float %x) #1
	; CHECK-NEXT: ret float [[POW]]			; CHECK-NEXT: [[RECP:%.*]] = fdiv fast float 1.000000e+00, [[CBRTF]]
				; CHECK-NEXT: ret float [[RECP]]
	;			;
	%pow = call fast float @llvm.pow.f32(float %x, float 0xbfd5555560000000)			%pow = call fast float @llvm.pow.f32(float %x, float 0xbfd5555560000000)
	ret float %pow			ret float %pow
	}			}

	define double @pow_intrinsic_negthird_approx(double %x) {			define double @pow_intrinsic_negthird_approx(double %x) {
	; CHECK-LABEL: @pow_intrinsic_negthird_approx(			; CHECK-LABEL: @pow_intrinsic_negthird_approx(
	; CHECK-NEXT: [[POW:%.*]] = call afn double @llvm.pow.f64(double [[X:%.*]], double 0xBFD5555555555555)			; CHECK-NEXT: [[POW:%.*]] = call afn double @llvm.pow.f64(double %x, double 0xBFD5555555555555)
	; CHECK-NEXT: ret double [[POW]]			; CHECK-NEXT: ret double [[POW]]
	;			;
	%pow = call afn double @llvm.pow.f64(double %x, double 0xbfd5555555555555)			%pow = call afn double @llvm.pow.f64(double %x, double 0xbfd5555555555555)
	ret double %pow			ret double %pow
	}			}

	define float @powf_intrinsic_negthird_approx(float %x) {			define float @powf_intrinsic_negthird_approx(float %x) {
	; CHECK-LABEL: @powf_intrinsic_negthird_approx(			; CHECK-LABEL: @powf_intrinsic_negthird_approx(
	; CHECK-NEXT: [[POW:%.*]] = call afn float @llvm.pow.f32(float [[X:%.*]], float 0xBFD5555560000000)			; CHECK-NEXT: [[POW:%.*]] = call afn float @llvm.pow.f32(float %x, float 0xBFD5555560000000)
	; CHECK-NEXT: ret float [[POW]]			; CHECK-NEXT: ret float [[POW]]
	;			;
	%pow = call afn float @llvm.pow.f32(float %x, float 0xbfd5555560000000)			%pow = call afn float @llvm.pow.f32(float %x, float 0xbfd5555560000000)
	ret float %pow			ret float %pow
	}			}

	define double @pow_libcall_negthird_fast(double %x) {			define double @pow_libcall_negthird_fast(double %x) {
	; CHECK-LABEL: @pow_libcall_negthird_fast(			; CHECK-LABEL: @pow_libcall_negthird_fast(
	; CHECK-NEXT: [[POW:%.*]] = call fast double @pow(double [[X:%.*]], double 0xBFD5555555555555)			; CHECK-NEXT: [[CBRT:%.*]] = call fast double @cbrt(double %x)
	; CHECK-NEXT: ret double [[POW]]			; CHECK-NEXT: [[RECP:%.*]] = fdiv fast double 1.000000e+00, [[CBRT]]
				; CHECK-NEXT: ret double [[RECP]]
	;			;
	%pow = call fast double @pow(double %x, double 0xbfd5555555555555)			%pow = call fast double @pow(double %x, double 0xbfd5555555555555)
	ret double %pow			ret double %pow
	}			}

	define float @powf_libcall_negthird_fast(float %x) {			define float @powf_libcall_negthird_fast(float %x) {
	; CHECK-LABEL: @powf_libcall_negthird_fast(			; CHECK-LABEL: @powf_libcall_negthird_fast(
	; CHECK-NEXT: [[POW:%.*]] = call fast float @powf(float [[X:%.*]], float 0xBFD5555560000000)			; CHECK-NEXT: [[CBRTF:%.*]] = call fast float @cbrtf(float %x)
	; CHECK-NEXT: ret float [[POW]]			; CHECK-NEXT: [[RECP:%.*]] = fdiv fast float 1.000000e+00, [[CBRTF]]
				; CHECK-NEXT: ret float [[RECP]]
	;			;
	%pow = call fast float @powf(float %x, float 0xbfd5555560000000)			%pow = call fast float @powf(float %x, float 0xbfd5555560000000)
	ret float %pow			ret float %pow
	}			}

	declare double @llvm.pow.f64(double, double) #0			declare double @llvm.pow.f64(double, double) #0
	declare float @llvm.pow.f32(float, float) #0			declare float @llvm.pow.f32(float, float) #0
	declare double @pow(double, double)			declare double @pow(double, double)
	declare float @powf(float, float)			declare float @powf(float, float)

	attributes #0 = { nounwind readnone speculatable }			attributes #0 = { nounwind readnone speculatable }

This is an archive of the discontinued LLVM Phabricator instance.

[SLC] Simplify pow(x, 0.333...) to cbrt(x)
AbandonedPublic
