This is an archive of the discontinued LLVM Phabricator instance.

[SLC] Allow llvm.pow(2**n,x) -> llvm.exp2(n*x) even if no exp2 lib func
Abandoned · Public

Authored by foad on May 4 2020, 3:56 AM.

Details

Reviewers

evandro
efriedma
spatel
lebedev.ri

Summary

replacePowWithExp generates a call either to the llvm.exp2 intrinsic or
to the exp2 library function. Allow it to use the intrinsic even if the
library function is not available.
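
The rewrite in the title rests on the identity pow(2**n, x) == exp2(n * x). A minimal standalone sketch of that identity follows; it is plain C++ using the C math library rather than LLVM code, and the helper names powOfTwoBaseViaExp2 and rewriteError are made up for illustration.

```cpp
#include <cmath>

// Hypothetical helper: computes pow(2^n, x) as exp2(n * x).
// A negative n covers reciprocal bases such as 0.25 == 2^-2.
double powOfTwoBaseViaExp2(int n, double x) {
  return std::exp2(static_cast<double>(n) * x);
}

// Hypothetical helper: relative difference between the direct pow() call
// and the rewritten form, for a base of exactly 2^n.
double rewriteError(int n, double x) {
  const double direct = std::pow(std::ldexp(1.0, n), x); // pow(2^n, x)
  const double rewritten = powOfTwoBaseViaExp2(n, x);
  return std::fabs(direct - rewritten) / std::fabs(direct);
}
```

For example, rewriteError(2, 1.5) compares pow(4.0, 1.5) with exp2(3.0), and rewriteError(-2, 3.0) compares pow(0.25, 3.0) with exp2(-6.0).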

Diff Detail

Repository: rG LLVM Github Monorepo

Event Timeline

foad created this revision. May 4 2020, 3:56 AM

Herald added a project: Restricted Project. May 4 2020, 3:56 AM

Herald added subscribers: llvm-commits, hiraditya.

lebedev.ri added inline comments. May 4 2020, 4:27 AM

llvm/lib/Transforms/Utils/SimplifyLibCalls.cpp
1551–1553	Was this previously reachable?

Harbormaster completed remote builds in B55618: Diff 261777. May 4 2020, 5:50 AM

foad marked an inline comment as done. May 4 2020, 5:57 AM

foad added inline comments.

llvm/lib/Transforms/Utils/SimplifyLibCalls.cpp
1551–1553	Yes, whenever we try to optimize a call to the llvm.pow intrinsic on a target that also has the pow libfunc available.

arsenm added a subscriber: arsenm. May 4 2020, 6:45 AM

arsenm added inline comments.

llvm/lib/Transforms/Utils/SimplifyLibCalls.cpp
1537–1538	I've always hated the way all of this code conflates library functions and intrinsics as if they're interchangeable.

efriedma added a comment.

When we're compiling C code, we can't blindly generate floating-point intrinsics. Even if we're confident the transform is correct, the backend has limited capabilities. If the corresponding libfunc doesn't exist, and the target doesn't have any special capabilities, the backend is stuck: we need to generate a function call, but the function doesn't exist.

We could potentially add some TTI query to ask about the intrinsic, specifically, as opposed to the library function. But we can't just bypass the check.

evandro accepted this revision.

LGTM

llvm/lib/Transforms/Utils/SimplifyLibCalls.cpp
1551	This can probably be cached into a variable at the top of the function.

This revision is now accepted and ready to land. May 4 2020, 3:09 PM

efriedma requested changes to this revision. May 4 2020, 3:44 PM

This revision now requires changes to proceed. May 4 2020, 3:44 PM

In D79321#2018473, @efriedma wrote:

When we're compiling C code, we can't blindly generate floating-point intrinsics. Even if we're confident the transform is correct, the backend has limited capabilities. If the corresponding libfunc doesn't exist, and the target doesn't have any special capabilities, the backend is stuck: we need to generate a function call, but the function doesn't exist.

We could potentially add some TTI query to ask about the intrinsic, specifically, as opposed to the library function. But we can't just bypass the check.

For background, I'm interested in the AMDGPU target which has no libfuncs at all, but certainly does support intrinsics like llvm.sin/cos/exp/log.f32 because there are hardware instructions for them.

My understanding was that generic (llvm.*) intrinsics were unconditionally supported by all targets, because they're part of the IR almost as much as Instructions are, and it's CodeGen's job to implement them one way or another.

But you're saying that this is not true, and furthermore there is no good way to query whether a particular intrinsic is supported by a particular target? Is there any way forward, short of implementing a new TTI query mechanism like you suggested?

The "intrinsic" part of an intrinsic just means that the call has special semantics. Some intrinsics can't be implemented by certain targets, or don't make sense. And some intrinsics which make sense on a target in theory might not actually be implemented. This is actually true for instructions as well, although the cases where this comes up are rarer.

Fundamentally, the issue is that when compiling C, LLVM has limited control over the link and runtime environment. If an environment requires linking against a libc that doesn't have exp2, we don't have any control over that, and can't really do anything about it, unless we want to teach the compiler to emit an implementation inline.

We don't currently have any uniform way of dealing with this; if you're interested in adding one, it would make sense.
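
The three positions in this thread can be compared with a small truth-table model. This is a standalone sketch, not LLVM code: the struct, enum, and function names are invented for illustration, and TargetLowersExp2Natively stands in for the kind of target query suggested above, which the review does not claim exists.

```cpp
// Inputs to the decision, reduced to three booleans for illustration.
struct PowSite {
  bool DoesNotAccessMemory;      // Pow->doesNotAccessMemory(), e.g. a call to llvm.pow
  bool HasExp2LibFunc;           // hasFloatFn(TLI, Ty, LibFunc_exp2, ...)
  bool TargetLowersExp2Natively; // HYPOTHETICAL target query, not an existing hook
};

enum class Rewrite { None, Exp2Intrinsic, Exp2LibCall };

// Before the patch: no rewrite at all unless the exp2 libfunc exists.
Rewrite beforePatch(const PowSite &S) {
  if (!S.HasExp2LibFunc)
    return Rewrite::None;
  return S.DoesNotAccessMemory ? Rewrite::Exp2Intrinsic : Rewrite::Exp2LibCall;
}

// As proposed in this revision: a memory-free pow call bypasses the
// libfunc check and always becomes llvm.exp2.
Rewrite proposedPatch(const PowSite &S) {
  if (!S.DoesNotAccessMemory && !S.HasExp2LibFunc)
    return Rewrite::None;
  return S.DoesNotAccessMemory ? Rewrite::Exp2Intrinsic : Rewrite::Exp2LibCall;
}

// Assumed alternative: bypass the libfunc check only when the target
// reports that it can lower llvm.exp2 without a library call.
Rewrite withTargetQuery(const PowSite &S) {
  if (S.HasExp2LibFunc)
    return beforePatch(S);
  if (S.DoesNotAccessMemory && S.TargetLowersExp2Natively)
    return Rewrite::Exp2Intrinsic;
  return Rewrite::None;
}
```

The case the objection is about is PowSite{true, false, false}: proposedPatch returns Exp2Intrinsic even though nothing can lower it, while withTargetQuery returns None. An AMDGPU-like site, PowSite{true, false, true}, gets Exp2Intrinsic from both.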

foad abandoned this revision. May 5 2020, 2:18 PM

Revision Contents

Path	Size
llvm/lib/Transforms/Utils/SimplifyLibCalls.cpp	3 lines
llvm/test/Transforms/InstCombine/pow-1.ll	40 lines

Diff 261777

llvm/lib/Transforms/Utils/SimplifyLibCalls.cpp

   if (match(Base, m_SpecificFP(2.0)) &&
       hasFloatFn(TLI, Ty, LibFunc_ldexp, LibFunc_ldexpf, LibFunc_ldexpl)) {
     if (Value *ExpoI = getIntToFPVal(Expo, B))
       return emitBinaryFloatFnCall(ConstantFP::get(Ty, 1.0), ExpoI, TLI,
                                    LibFunc_ldexp, LibFunc_ldexpf, LibFunc_ldexpl,
                                    B, Attrs);
   }

   // pow(2.0 ** n, x) -> exp2(n * x)
-  if (hasFloatFn(TLI, Ty, LibFunc_exp2, LibFunc_exp2f, LibFunc_exp2l)) {
+  if (Pow->doesNotAccessMemory() ||
+      hasFloatFn(TLI, Ty, LibFunc_exp2, LibFunc_exp2f, LibFunc_exp2l)) {
       [Inline comment] arsenm: I've always hated the way all of this code conflates library functions and intrinsics as if they're interchangeable.
     APFloat BaseR = APFloat(1.0);
     BaseR.convert(BaseF->getSemantics(), APFloat::rmTowardZero, &Ignored);
     BaseR = BaseR / *BaseF;
     bool IsInteger = BaseF->isInteger(), IsReciprocal = BaseR.isInteger();
     const APFloat *NF = IsReciprocal ? &BaseR : BaseF;
     APSInt NI(64, false);
     if ((IsInteger || IsReciprocal) &&
         NF->convertToInteger(NI, APFloat::rmTowardZero, &Ignored) ==
             APFloat::opOK &&
         NI > 1 && NI.isPowerOf2()) {
       double N = NI.logBase2() * (IsReciprocal ? -1.0 : 1.0);
       Value *FMul = B.CreateFMul(Expo, ConstantFP::get(Ty, N), "mul");
       if (Pow->doesNotAccessMemory())
       [Inline comment] evandro: This can probably be cached into a variable at the top of the function.
         return B.CreateCall(Intrinsic::getDeclaration(Mod, Intrinsic::exp2, Ty),
                             FMul, "exp2");
       [Inline comment] lebedev.ri: Was this previously reachable?
       [Inline comment] foad: Yes whenever we try to optimize a call to the llvm.pow intrinsic on a target that also has the pow libfunc available.
       else
         return emitUnaryFloatFnCall(FMul, TLI, LibFunc_exp2, LibFunc_exp2f,
                                     LibFunc_exp2l, B, Attrs);
     }
   }

   // pow(10.0, x) -> exp10(x)
   // TODO: There is no exp10() intrinsic yet, but some day there shall be one.
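
The guarded block above accepts a constant base only when it is an integer power of two greater than 1, or the reciprocal of one, and then scales the exponent by N = ±log2 of that integer. A rough standalone sketch of the same classification follows. It uses std::frexp on a double instead of APFloat and APSInt, the name exp2Multiplier is hypothetical, and the bound of 62 on |N| is an assumption about what the 64-bit signed integer conversion in the patch would accept.

```cpp
#include <cmath>
#include <optional>

// Hypothetical helper: returns N such that Base == 2^N when Base is an
// integer power of two greater than 1 (N > 0) or the reciprocal of one
// (N < 0). Returns std::nullopt for any other base, including 1.0,
// negative values, zero, infinities, and NaN.
std::optional<double> exp2Multiplier(double Base) {
  int Exp = 0;
  // frexp splits Base into Mantissa * 2^Exp with |Mantissa| in [0.5, 1).
  const double Mantissa = std::frexp(Base, &Exp);
  if (Mantissa != 0.5)
    return std::nullopt; // not an exact positive power of two
  const int N = Exp - 1; // Base == 0.5 * 2^Exp == 2^(Exp - 1)
  // Assumption: |N| <= 62 approximates the range of the 64-bit signed
  // integer conversion in the original code; N == 0 (Base == 1.0) is
  // rejected, matching the NI > 1 check.
  if (N == 0 || N > 62 || N < -62)
    return std::nullopt;
  return static_cast<double>(N);
}
```

With this sketch, a base of 4.0 yields 2.0 and a base of 0.5 yields -1.0, which lines up with the fmul by 2.0 and the fneg that the vector tests in pow-1.ll expect.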

llvm/test/Transforms/InstCombine/pow-1.ll

 ; NOLIB-NEXT: ret double [[POW]]
 ;
   %retval = call double @pow(double 0.25, double %x)
   ret double %retval
 }

 define <2 x float> @test_simplify3v(<2 x float> %x) {
 ; CHECK-LABEL: @test_simplify3v(
-; ANY-NEXT: [[EXP2:%.*]] = call <2 x float> @llvm.exp2.v2f32(<2 x float> [[X:%.*]])
-; ANY-NEXT: ret <2 x float> [[EXP2]]
-; MSVC-NEXT: [[POW:%.*]] = call <2 x float> @llvm.pow.v2f32(<2 x float> <float 2.000000e+00, float 2.000000e+00>, <2 x float> [[X:%.*]])
-; MSVC-NEXT: ret <2 x float> [[POW]]
-; TODO: should be able to simplify llvm.pow to llvm.exp2 even without libcalls
-; NOLIB-NEXT: [[POW:%.*]] = call <2 x float> @llvm.pow.v2f32(<2 x float> <float 2.000000e+00, float 2.000000e+00>, <2 x float> [[X:%.*]])
-; NOLIB-NEXT: ret <2 x float> [[POW]]
+; CHECK-NEXT: [[EXP2:%.*]] = call <2 x float> @llvm.exp2.v2f32(<2 x float> [[X:%.*]])
+; CHECK-NEXT: ret <2 x float> [[EXP2]]
 ;
   %retval = call <2 x float> @llvm.pow.v2f32(<2 x float> <float 2.0, float 2.0>, <2 x float> %x)
   ret <2 x float> %retval
 }

 define <2 x double> @test_simplify3vn(<2 x double> %x) {
 ; CHECK-LABEL: @test_simplify3vn(
-; ANY-NEXT: [[MUL:%.*]] = fmul <2 x double> [[X:%.*]], <double 2.000000e+00, double 2.000000e+00>
-; ANY-NEXT: [[EXP2:%.*]] = call <2 x double> @llvm.exp2.v2f64(<2 x double> [[MUL]])
-; ANY-NEXT: ret <2 x double> [[EXP2]]
-; MSVC-NEXT: [[POW:%.*]] = call <2 x double> @llvm.pow.v2f64(<2 x double> <double 4.000000e+00, double 4.000000e+00>, <2 x double> [[X:%.*]])
-; MSVC-NEXT: ret <2 x double> [[POW]]
-; TODO: should be able to simplify llvm.pow to llvm.exp2 even without libcalls
-; NOLIB-NEXT: [[POW:%.*]] = call <2 x double> @llvm.pow.v2f64(<2 x double> <double 4.000000e+00, double 4.000000e+00>, <2 x double> [[X:%.*]])
-; NOLIB-NEXT: ret <2 x double> [[POW]]
+; CHECK-NEXT: [[MUL:%.*]] = fmul <2 x double> [[X:%.*]], <double 2.000000e+00, double 2.000000e+00>
+; CHECK-NEXT: [[EXP2:%.*]] = call <2 x double> @llvm.exp2.v2f64(<2 x double> [[MUL]])
+; CHECK-NEXT: ret <2 x double> [[EXP2]]
 ;
   %retval = call <2 x double> @llvm.pow.v2f64(<2 x double> <double 4.0, double 4.0>, <2 x double> %x)
   ret <2 x double> %retval
 }

 define double @test_simplify4(double %x) {
 ; CHECK-LABEL: @test_simplify4(
 ; ANY-NEXT: [[EXP2:%.*]] = call double @exp2(double [[X:%.*]])
 [29 lines not shown]
 ; NOLIB-NEXT: ret float [[POW]]
 ;
   %retval = call float @powf(float 8.0, float %x)
   ret float %retval
 }

 define <2 x double> @test_simplify4v(<2 x double> %x) {
 ; CHECK-LABEL: @test_simplify4v(
-; ANY-NEXT: [[EXP2:%.*]] = call <2 x double> @llvm.exp2.v2f64(<2 x double> [[X:%.*]])
-; ANY-NEXT: ret <2 x double> [[EXP2]]
-; MSVC-NEXT: [[POW:%.*]] = call <2 x double> @llvm.pow.v2f64(<2 x double> <double 2.000000e+00, double 2.000000e+00>, <2 x double> [[X:%.*]])
-; MSVC-NEXT: ret <2 x double> [[POW]]
-; TODO: should be able to simplify llvm.pow to llvm.exp2 even without libcalls
-; NOLIB-NEXT: [[POW:%.*]] = call <2 x double> @llvm.pow.v2f64(<2 x double> <double 2.000000e+00, double 2.000000e+00>, <2 x double> [[X:%.*]])
-; NOLIB-NEXT: ret <2 x double> [[POW]]
+; CHECK-NEXT: [[EXP2:%.*]] = call <2 x double> @llvm.exp2.v2f64(<2 x double> [[X:%.*]])
+; CHECK-NEXT: ret <2 x double> [[EXP2]]
 ;
   %retval = call <2 x double> @llvm.pow.v2f64(<2 x double> <double 2.0, double 2.0>, <2 x double> %x)
   ret <2 x double> %retval
 }

 define <2 x float> @test_simplify4vn(<2 x float> %x) {
 ; CHECK-LABEL: @test_simplify4vn(
-; ANY-NEXT: [[MUL:%.*]] = fneg <2 x float> [[X:%.*]]
-; ANY-NEXT: [[EXP2:%.*]] = call <2 x float> @llvm.exp2.v2f32(<2 x float> [[MUL]])
-; ANY-NEXT: ret <2 x float> [[EXP2]]
-; MSVC-NEXT: [[POW:%.*]] = call <2 x float> @llvm.pow.v2f32(<2 x float> <float 5.000000e-01, float 5.000000e-01>, <2 x float> [[X:%.*]])
-; MSVC-NEXT: ret <2 x float> [[POW]]
-; TODO: should be able to simplify llvm.pow to llvm.exp2 even without libcalls
-; NOLIB-NEXT: [[POW:%.*]] = call <2 x float> @llvm.pow.v2f32(<2 x float> <float 5.000000e-01, float 5.000000e-01>, <2 x float> [[X:%.*]])
-; NOLIB-NEXT: ret <2 x float> [[POW]]
+; CHECK-NEXT: [[MUL:%.*]] = fneg <2 x float> [[X:%.*]]
+; CHECK-NEXT: [[EXP2:%.*]] = call <2 x float> @llvm.exp2.v2f32(<2 x float> [[MUL]])
+; CHECK-NEXT: ret <2 x float> [[EXP2]]
 ;
   %retval = call <2 x float> @llvm.pow.v2f32(<2 x float> <float 0.5, float 0.5>, <2 x float> %x)
   ret <2 x float> %retval
 }

 ; Check pow(x, 0.0) -> 1.0.

 define float @test_simplify5(float %x) {