This is an archive of the discontinued LLVM Phabricator instance.


Differential D87877

[InstCombine] Fix errno bug in pow expansion to sqrt
ClosedPublic

Authored by hubert.reinterpretcast on Sep 17 2020, 7:58 PM.

Details

Reviewers

spatel
nemanjai
daltenty

Commits

rG32c9991dab5c: [InstCombine] Fix errno bug in pow expansion to sqrt

Summary

A conversion from pow to sqrt shall not call an errno-setting sqrt with -infinity: the sqrt will set EDOM where the pow call need not.

This patch avoids the erroneous (pun not intended) transformation by applying the restrictions discussed in the thread for https://lists.llvm.org/pipermail/llvm-dev/2020-September/145051.html.
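The difference described above can be observed from user code. The following is a minimal sketch, assuming a libm that reports domain errors through errno; the names `MathResult`, `powHalfWithErrno`, and `sqrtWithErrno` are hypothetical helpers for illustration, not part of the patch. Whether `sqrt` actually sets EDOM depends on the library and on compiler flags.

```cpp
#include <cerrno>
#include <cmath>
#include <limits>

// Result value plus the errno value observed right after the call.
// Hypothetical helper type, used only for this illustration.
struct MathResult {
  double value;
  int err;
};

// Calls pow(x, 0.5) through a volatile operand so the call is not folded at
// compile time, then captures errno.
inline MathResult powHalfWithErrno(double x) {
  volatile double base = x;
  errno = 0;
  double r = std::pow(base, 0.5);
  return {r, errno};
}

// Same as above, but for sqrt(x).
inline MathResult sqrtWithErrno(double x) {
  volatile double base = x;
  errno = 0;
  double r = std::sqrt(base);
  return {r, errno};
}
```

Called with -infinity, `powHalfWithErrno` should yield +infinity with errno left untouched, while `sqrtWithErrno` yields NaN and, on an errno-setting libm, is expected to report EDOM.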

The existing tests are updated (depending on emphasis in the checks for library calls, avoidance of overlap, and overall coverage):

to add ninf, retaining the intended library call,
to use the intrinsic, retaining the use of select, or
to expect the replacement to not occur.

The following is tested:

The pow intrinsic folds to a select instruction to handle -infinity.
The pow library call folds, with ninf, to sqrt without the select instruction associated with handling -infinity.
The pow library call does not fold to sqrt without ninf.
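The three tested cases can be summarized as a small decision function. This is a simplified model of the conditions in this revision, not LLVM code: `PowHalfFold` and `classifyPowHalf` are hypothetical names, `isIntrinsic` stands for a call that does not access memory (such as `llvm.pow`), and `hasNinf` stands for the `ninf` fast-math flag. Other conditions of the real transform (for example the availability of a sqrt function on the target) are left out.

```cpp
// Possible outcomes for pow(x, 0.5) in this simplified model.
enum class PowHalfFold { KeepPow, SqrtWithSelect, SqrtOnly };

// Hypothetical model of the decision, not an LLVM API.
inline PowHalfFold classifyPowHalf(bool isIntrinsic, bool hasNinf) {
  // An errno-setting library call whose base may be -infinity is left alone.
  if (!isIntrinsic && !hasNinf)
    return PowHalfFold::KeepPow;
  // With ninf the base cannot be -infinity, so no select is needed.
  return hasNinf ? PowHalfFold::SqrtOnly : PowHalfFold::SqrtWithSelect;
}
```

For example, `classifyPowHalf(false, true)` models the library call with `ninf`, which folds to sqrt without the select.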

Diff Detail

Repository: rG LLVM Github Monorepo

Unit Tests: Failed

Time	Test
410 ms	windows > LLVM.Other::change-printer.ll

Event Timeline

hubert.reinterpretcast created this revision. Sep 17 2020, 7:58 PM

Herald added a project: Restricted Project. Sep 17 2020, 7:58 PM

Herald added a subscriber: hiraditya.

hubert.reinterpretcast requested review of this revision. Sep 17 2020, 7:58 PM

Harbormaster completed remote builds in B72123: Diff 292694. Sep 17 2020, 8:31 PM

hubert.reinterpretcast edited the summary of this revision. Sep 18 2020, 11:01 AM

spatel added inline comments. Sep 21 2020, 5:54 AM

llvm/lib/Transforms/Utils/SimplifyLibCalls.cpp

1735–1737

This transform is making this patch more complicated, right?
How about adding an explicit check to avoid it for the single sqrt case (that could be a preliminary NFC cleanup patch if you prefer). Then we just add the more obvious correctness check inside of replacePowWithSqrt():

diff --git a/llvm/lib/Transforms/Utils/SimplifyLibCalls.cpp b/llvm/lib/Transforms/Utils/SimplifyLibCalls.cpp
index 60b7da7e64f..4ce580f47ad 100644
--- a/llvm/lib/Transforms/Utils/SimplifyLibCalls.cpp
+++ b/llvm/lib/Transforms/Utils/SimplifyLibCalls.cpp
@@ -1636,6 +1636,14 @@ Value *LibCallSimplifier::replacePowWithSqrt(CallInst *Pow, IRBuilderBase &B) {
   if (ExpoF->isNegative() && (!Pow->hasApproxFunc() && !Pow->hasAllowReassoc()))
     return nullptr;
 
+  // If we have a pow() library call (accesses memory) and we can't guarantee
+  // that the base is not Inf, give up:
+  // pow(-Inf, 0.5) does not set errno (result is +Inf as shown below), but
+  // sqrt(-Inf) may set errno.
+  if (!Pow->doesNotAccessMemory() && !Pow->hasNoInfs() &&
+      !isKnownNeverInfinity(Base, TLI))
+    return nullptr;
+
   Sqrt = getSqrtCall(Base, Attrs, Pow->doesNotAccessMemory(), Mod, B, TLI);
   if (!Sqrt)
     return nullptr;
@@ -1722,7 +1730,8 @@ Value *LibCallSimplifier::optimizePow(CallInst *Pow, IRBuilderBase &B) {
 
   // pow(x, n) -> x * x * x * ...
   const APFloat *ExpoF;
-  if (AllowApprox && match(Expo, m_APFloat(ExpoF))) {
+  if (AllowApprox && match(Expo, m_APFloat(ExpoF)) &&
+      !ExpoF->isExactlyValue(0.5) && !ExpoF->isExactlyValue(-0.5)) {
     // We limit to a max of 7 multiplications, thus the maximum exponent is 32.
     // If the exponent is an integer+0.5 we generate a call to sqrt and an
     // additional fmul.
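The second hunk concerns the path whose comment reads "If the exponent is an integer+0.5 we generate a call to sqrt and an additional fmul"; the added guard keeps an exponent of exactly +/-0.5 out of that path. As a rough illustration of what that expansion computes, here is a sketch using plain square-and-multiply. The function name `powIntegerPlusHalf` is hypothetical, and the multiplication order is an assumption that does not necessarily match the sequence LLVM emits.

```cpp
#include <cmath>

// Computes x^(whole + 0.5) as sqrt(x) * x^whole, building x^whole by
// square-and-multiply. Illustration only; not LLVM's actual expansion.
inline double powIntegerPlusHalf(double x, unsigned whole) {
  double result = std::sqrt(x); // the x^0.5 factor
  double square = x;            // x^(2^i) for the current bit i
  for (unsigned n = whole; n != 0; n >>= 1) {
    if (n & 1)
      result *= square;
    square *= square;
  }
  return result;
}
```

With `whole == 0` this degenerates to a bare sqrt, which is the single-sqrt case the suggested check hands back to replacePowWithSqrt().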

llvm/test/Transforms/InstCombine/pow-1.ll

269

I realize this will be a little more work, but it would be better to replicate this test with the additional FMF (and similarly for other tests that are changing the flags and/or libcall/intrinsic).

That way, we'll retain the likely original intent of the test and add coverage for the cases that we want to verify are not miscompiled.

hubert.reinterpretcast added inline comments. Sep 21 2020, 8:15 AM

llvm/test/Transforms/InstCombine/pow-1.ll

269

I'm not convinced the coverage is meaningfully being lost. There is already a fairly exhaustive combination of tests in `pow-sqrt.ll`. The `double` version of this test (with no FMF) appears twice already. I meant what I said when I wrote that the changes are made "depending on emphasis in the checks for library calls, avoidance of overlap, and overall coverage". Following that guideline, I found the specific test changes to be rather deterministic.

spatel added inline comments. Sep 21 2020, 8:27 AM

llvm/test/Transforms/InstCombine/pow-1.ll

269

OK. I agree that there are a lot of tests for this transform (because it's been shown buggy even before the bug you found). If we can organize it better that would be great, but that doesn't need to hold up the fix for a miscompile.

llvm/test/Transforms/InstCombine/pow-sqrt.ll

31

This comment should be more like above:

; The transform to sqrt is not allowed if we risk setting errno due to -INF.

Although that raises a question: if the user has allowed an approximation of pow(), do they really expect that errno would be set accurately? Similarly, if they allowed 'reassoc'...

hubert.reinterpretcast marked 2 inline comments as done. Sep 21 2020, 8:35 AM

hubert.reinterpretcast added inline comments.

llvm/test/Transforms/InstCombine/pow-sqrt.ll

31

Given NaN propagation, if the `sqrt` was okay, then the result here should be NaN. That is, I think the question is more "can we just do the transform and omit the select"?

hubert.reinterpretcast added inline comments. Sep 21 2020, 10:20 AM

llvm/lib/Transforms/Utils/SimplifyLibCalls.cpp

1735–1737

I'll look into committing the check as an NFC cleanup. I'll probably also move this transform into its own function (like the earlier transforms) as an NFC cleanup too. That would make the (possibly unintended) different handling of the early exit cases from this transformation (compared to the other ones) more obvious.

hubert.reinterpretcast added inline comments. Sep 21 2020, 8:55 PM

llvm/lib/Transforms/Utils/SimplifyLibCalls.cpp

1735–1737

Splitting this part out is indeed a good idea. The change is not NFC and actually fixes a problem:

LLVM ERROR: Instruction Combining seems stuck in an infinite loop after 100 iterations.

on a case (with opt -instcombine -S -disable-builtin sqrt) like:

; (float)pow((double)(float)x, 0.5)
define float @shrink_pow_libcall_half(float %x) {
  %dx = fpext float %x to double
  %call = call fast double @pow(double %dx, double 0.5)
  %fr = fptrunc double %call to float
  ret float %fr
}

hubert.reinterpretcast mentioned this in D88066: [InstCombine] For pow(x, +/-0.5), stop falling into pow(x, 1.5), etc. case. Sep 21 2020, 9:06 PM

Rebase on top of D88066; add isKnownNeverInfinity condition
Adjust comment for afn case

hubert.reinterpretcast added a parent revision: D88066: [InstCombine] For pow(x, +/-0.5), stop falling into pow(x, 1.5), etc. case. Sep 22 2020, 8:56 AM

hubert.reinterpretcast marked an inline comment as done. Sep 22 2020, 9:17 AM

hubert.reinterpretcast added inline comments.

llvm/lib/Transforms/Utils/SimplifyLibCalls.cpp

1645

@spatel, I've added the `isKnownNeverInfinity` check here as suggested in your comment below. I am not entirely sure whether it is quite effective though. It seems this query may be happening "too early". Example case I tried (with `opt -instcombine`):

define double @pow_libcall_half_no_FMF_base_ninf(i32 %x) {
  %conv = sitofp i32 %x to double
  %pow = call double @pow(double %conv, double 5.0e-01)
  ret double %pow
}

llvm/test/Transforms/InstCombine/pow-sqrt.ll

31

I've updated the comment.

Harbormaster completed remote builds in B72528: Diff 293477. Sep 22 2020, 9:54 AM

hubert.reinterpretcast mentioned this in rG6801950192ff: [InstCombine] For pow(x, +/-0.5), stop falling into pow(x, 1.5), etc. case. Sep 22 2020, 11:23 AM

LGTM

llvm/lib/Transforms/Utils/SimplifyLibCalls.cpp

1645

This is yet another bug independent of this patch:

$ opt -instsimplify inf.ll -S
define i1 @src(i32 %x) {
  %conv = sitofp i32 %x to double
  %r = fcmp oeq double %conv, 0x7FF0000000000000
  ret i1 %r
}

$ opt -instcombine inf.ll -S
define i1 @src(i32 %x) {
  ret i1 false
}

https://alive2.llvm.org/ce/z/a54zEx

So instcombine knows that transform, but ValueTracking/instsimplify do not. I'll put that on my TODO list. If you change your example to use "uitofp" it should work as expected.
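The fold to `ret i1 false` rests on the fact that converting a 32-bit signed integer to double can never produce an infinity, since every such integer is exactly representable as a finite double. A minimal sketch of that fact in C++ follows; the helper name `int32ToDoubleIsFinite` is hypothetical, and the `static_cast` stands in for `sitofp i32 ... to double`.

```cpp
#include <cmath>
#include <cstdint>
#include <limits>

// Converts a 32-bit signed integer to double and reports whether the result
// is finite. The conversion is exact, so this is expected to hold for every
// input value.
inline bool int32ToDoubleIsFinite(std::int32_t v) {
  return std::isfinite(static_cast<double>(v));
}
```

Checking the extremes (`INT32_MIN` and `INT32_MAX`) is enough to convince oneself, because the conversion is monotonic.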

This revision is now accepted and ready to land. Sep 22 2020, 12:51 PM

This revision was landed with ongoing or failed builds. Sep 22 2020, 4:00 PM

Closed by commit rG32c9991dab5c: [InstCombine] Fix errno bug in pow expansion to sqrt (authored by hubert.reinterpretcast).

This revision was automatically updated to reflect the committed changes.

hubert.reinterpretcast added a commit: rG32c9991dab5c: [InstCombine] Fix errno bug in pow expansion to sqrt.

I haven't been following this closely, but is there some reason we can't transform powf(x, 0.5) to sqrt(x == -infinity ? qnan : x)?

In D87877#2288974, @efriedma wrote:

I haven't been following this closely, but is there some reason we can't transform powf(x, 0.5) to sqrt(x == -infinity ? qnan : x)?

Hmm, moving the select inwards should work, yes: sqrt(x == -infinity ? +infinity : x).
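The "select inwards" form can be sketched in C++ as follows. This is an illustration of the idea from the comment above, not code from the patch: `powHalfViaGuardedSqrt` is a hypothetical name, and the outer `fabs` is carried over from the existing transform's signed-zero handling rather than from the comment itself.

```cpp
#include <cmath>
#include <limits>

// Models pow(x, 0.5) as fabs(sqrt(x == -infinity ? +infinity : x)).
// The guard runs before the sqrt, so sqrt never sees -infinity and has no
// reason to report a domain error for that input.
inline double powHalfViaGuardedSqrt(double x) {
  const double inf = std::numeric_limits<double>::infinity();
  const double guarded = (x == -inf) ? inf : x;
  return std::fabs(std::sqrt(guarded));
}
```

For -infinity this returns +infinity, matching pow(-infinity, 0.5); a negative finite base still produces NaN, as it would for pow.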

spatel mentioned this in rG645c53a9d923: [ValueTracking] enhance isKnownNeverInfinity to understand sitofp. Sep 27 2020, 6:03 AM

Revision Contents

Path	Size
llvm/include/llvm/Transforms/Utils/SimplifyLibCalls.h	2 lines
llvm/lib/Transforms/Utils/SimplifyLibCalls.cpp	32 lines
llvm/test/Transforms/InstCombine/pow-1.ll	74 lines
llvm/test/Transforms/InstCombine/pow-sqrt.ll	65 lines
llvm/test/Transforms/InstCombine/win-math.ll	2 lines

Diff 292694

llvm/include/llvm/Transforms/Utils/SimplifyLibCalls.h

@@ ... @@ private:
   Value *optimizeBCopy(CallInst *CI, IRBuilderBase &B);
   // Wrapper for all String/Memory Library Call Optimizations
   Value *optimizeStringMemoryLibCall(CallInst *CI, IRBuilderBase &B);
 
   // Math Library Optimizations
   Value *optimizeCAbs(CallInst *CI, IRBuilderBase &B);
   Value *optimizePow(CallInst *CI, IRBuilderBase &B);
   Value *replacePowWithExp(CallInst *Pow, IRBuilderBase &B);
-  Value *replacePowWithSqrt(CallInst *Pow, IRBuilderBase &B);
+  std::pair<Value *, bool> replacePowWithSqrt(CallInst *Pow, IRBuilderBase &B);
   Value *optimizeExp2(CallInst *CI, IRBuilderBase &B);
   Value *optimizeFMinFMax(CallInst *CI, IRBuilderBase &B);
   Value *optimizeLog(CallInst *CI, IRBuilderBase &B);
   Value *optimizeSqrt(CallInst *CI, IRBuilderBase &B);
   Value *optimizeSinCosPi(CallInst *CI, IRBuilderBase &B);
   Value *optimizeTan(CallInst *CI, IRBuilderBase &B);
   // Wrapper for all floating point library call optimizations
   Value *optimizeFloatingPointLibCall(CallInst *CI, LibFunc Func,

llvm/lib/Transforms/Utils/SimplifyLibCalls.cpp

@@ ... @@ if (hasFloatFn(TLI, V->getType(), LibFunc_sqrt, LibFunc_sqrtf, LibFunc_sqrtl))
     // the target has a sqrt() libcall, which is not exactly the same.
     return emitUnaryFloatFnCall(V, TLI, LibFunc_sqrt, LibFunc_sqrtf,
                                 LibFunc_sqrtl, B, Attrs);
 
   return nullptr;
 }
 
 /// Use square root in place of pow(x, +/-0.5).
-Value *LibCallSimplifier::replacePowWithSqrt(CallInst *Pow, IRBuilderBase &B) {
+std::pair<Value *, bool>
+LibCallSimplifier::replacePowWithSqrt(CallInst *Pow, IRBuilderBase &B) {
   Value *Sqrt, *Base = Pow->getArgOperand(0), *Expo = Pow->getArgOperand(1);
   AttributeList Attrs; // Attributes are only meaningful on the original call
   Module *Mod = Pow->getModule();
   Type *Ty = Pow->getType();
 
   const APFloat *ExpoF;
   if (!match(Expo, m_APFloat(ExpoF)) ||
       (!ExpoF->isExactlyValue(0.5) && !ExpoF->isExactlyValue(-0.5)))
-    return nullptr;
+    return {nullptr, false};
 
   // Converting pow(X, -0.5) to 1/sqrt(X) may introduce an extra rounding step,
   // so that requires fast-math-flags (afn or reassoc).
   if (ExpoF->isNegative() && (!Pow->hasApproxFunc() && !Pow->hasAllowReassoc()))
-    return nullptr;
+    return {nullptr, true};
 
-  Sqrt = getSqrtCall(Base, Attrs, Pow->doesNotAccessMemory(), Mod, B, TLI);
+  const bool CallDoesNotAccessMemory = Pow->doesNotAccessMemory();
+  const bool HasNoInfs = Pow->hasNoInfs();
+  // Avoid library call to sqrt function family causing errno = EDOM when
+  // x == -infinity.
+  if (!(CallDoesNotAccessMemory || HasNoInfs))
+    return {nullptr, true};
+
+  Sqrt = getSqrtCall(Base, Attrs, CallDoesNotAccessMemory, Mod, B, TLI);
   if (!Sqrt)
-    return nullptr;
+    return {nullptr, true};
 
   // Handle signed zero base by expanding to fabs(sqrt(x)).
   if (!Pow->hasNoSignedZeros()) {
     Function *FAbsFn = Intrinsic::getDeclaration(Mod, Intrinsic::fabs, Ty);
     Sqrt = B.CreateCall(FAbsFn, Sqrt, "abs");
   }
 
   // Handle non finite base by expanding to
   // (x == -infinity ? +infinity : sqrt(x)).
-  if (!Pow->hasNoInfs()) {
+  if (!HasNoInfs) {
     Value *PosInf = ConstantFP::getInfinity(Ty),
           *NegInf = ConstantFP::getInfinity(Ty, true);
     Value *FCmp = B.CreateFCmpOEQ(Base, NegInf, "isinf");
     Sqrt = B.CreateSelect(FCmp, PosInf, Sqrt);
   }
 
   // If the exponent is negative, then get the reciprocal.
   if (ExpoF->isNegative())
     Sqrt = B.CreateFDiv(ConstantFP::get(Ty, 1.0), Sqrt, "reciprocal");
 
-  return Sqrt;
+  return {Sqrt, true};
 }
 
 static Value *createPowWithIntegerExponent(Value *Base, Value *Expo, Module *M,
                                            IRBuilderBase &B) {
   Value *Args[] = {Base, Expo};
   Function *F = Intrinsic::getDeclaration(M, Intrinsic::powi, Base->getType());
   return B.CreateCall(F, Args);
 }
@@ ... @@ Value *LibCallSimplifier::optimizePow(CallInst *Pow, IRBuilderBase &B) {
   // pow(x, 1.0) -> x
   if (match(Expo, m_FPOne()))
     return Base;
 
   // pow(x, 2.0) -> x * x
   if (match(Expo, m_SpecificFP(2.0)))
     return B.CreateFMul(Base, Base, "square");
 
-  if (Value *Sqrt = replacePowWithSqrt(Pow, B))
-    return Sqrt;
+  bool IsSqrtOrSqrtReciprocal;
+  Value *PowReplacedWithSqrt;
+  std::tie(PowReplacedWithSqrt, IsSqrtOrSqrtReciprocal) =
+      replacePowWithSqrt(Pow, B);
+  if (PowReplacedWithSqrt)
+    return PowReplacedWithSqrt;
 
   // pow(x, n) -> x * x * x * ...
   const APFloat *ExpoF;
-  if (AllowApprox && match(Expo, m_APFloat(ExpoF))) {
+  if (!IsSqrtOrSqrtReciprocal && AllowApprox && match(Expo, m_APFloat(ExpoF))) {
     // We limit to a max of 7 multiplications, thus the maximum exponent is 32.
     // If the exponent is an integer+0.5 we generate a call to sqrt and an
     // additional fmul.
     // TODO: This whole transformation should be backend specific (e.g. some
     //       backends might prefer libcalls or the limit for the exponent might
     //       be different) and it should also consider optimizing for size.
     APFloat LimF(ExpoF->getSemantics(), 33),
             ExpoA(abs(*ExpoF));

llvm/test/Transforms/InstCombine/pow-1.ll

	Show All 13 Lines
	; RUN: opt -instcombine -S < %s -mtriple=x86_64-pc-windows-msvc18 \| FileCheck %s --check-prefixes=CHECK,LIB,MSVC,VC64,CHECK-NO-EXP10			; RUN: opt -instcombine -S < %s -mtriple=x86_64-pc-windows-msvc18 \| FileCheck %s --check-prefixes=CHECK,LIB,MSVC,VC64,CHECK-NO-EXP10
	; RUN: opt -instcombine -S < %s -mtriple=x86_64-pc-windows-msvc \| FileCheck %s --check-prefixes=CHECK,LIB,MSVC,VC83,VC19,CHECK-NO-EXP10			; RUN: opt -instcombine -S < %s -mtriple=x86_64-pc-windows-msvc \| FileCheck %s --check-prefixes=CHECK,LIB,MSVC,VC83,VC19,CHECK-NO-EXP10
	; RUN: opt -instcombine -S < %s -mtriple=amdgcn-- \| FileCheck %s --check-prefixes=CHECK,NOLIB,CHECK-NO-EXP10			; RUN: opt -instcombine -S < %s -mtriple=amdgcn-- \| FileCheck %s --check-prefixes=CHECK,NOLIB,CHECK-NO-EXP10

	; NOTE: The readonly attribute on the pow call should be preserved			; NOTE: The readonly attribute on the pow call should be preserved
	; in the cases below where pow is transformed into another function call.			; in the cases below where pow is transformed into another function call.

	declare float @powf(float, float) nounwind readonly			declare float @powf(float, float) nounwind readonly
				declare float @llvm.pow.f32(float, float)
	declare double @pow(double, double) nounwind readonly			declare double @pow(double, double) nounwind readonly
	declare double @llvm.pow.f64(double, double)			declare double @llvm.pow.f64(double, double)
	declare <2 x float> @llvm.pow.v2f32(<2 x float>, <2 x float>) nounwind readonly			declare <2 x float> @llvm.pow.v2f32(<2 x float>, <2 x float>) nounwind readonly
	declare <2 x double> @llvm.pow.v2f64(<2 x double>, <2 x double>) nounwind readonly			declare <2 x double> @llvm.pow.v2f64(<2 x double>, <2 x double>) nounwind readonly

	; Check pow(1.0, x) -> 1.0.			; Check pow(1.0, x) -> 1.0.

	define float @test_simplify1(float %x) {			define float @test_simplify1(float %x) {
	▲ Show 20 Lines • Show All 212 Lines • ▼ Show 20 Lines
	; CHECK-NEXT: ret <2 x double> <double 1.000000e+00, double 1.000000e+00>			; CHECK-NEXT: ret <2 x double> <double 1.000000e+00, double 1.000000e+00>
	;			;
	%retval = call <2 x double> @llvm.pow.v2f64(<2 x double> %x, <2 x double> <double 0.0, double 0.0>)			%retval = call <2 x double> @llvm.pow.v2f64(<2 x double> %x, <2 x double> <double 0.0, double 0.0>)
	ret <2 x double> %retval			ret <2 x double> %retval
	}			}

	; Check pow(x, 0.5) -> fabs(sqrt(x)), where x != -infinity.			; Check pow(x, 0.5) -> fabs(sqrt(x)), where x != -infinity.

	define float @powf_libcall_to_select_sqrt(float %x) {			define float @powf_libcall_half_ninf(float %x) {
	; CHECK-LABEL: @powf_libcall_to_select_sqrt(			; CHECK-LABEL: @powf_libcall_half_ninf(
	; ANY-NEXT: [[SQRTF:%.]] = call float @sqrtf(float [[X:%.]])			; ANY-NEXT: [[SQRTF:%.]] = call ninf float @sqrtf(float [[X:%.]])
	; ANY-NEXT: [[ABS:%.*]] = call float @llvm.fabs.f32(float [[SQRTF]])			; ANY-NEXT: [[ABS:%.*]] = call ninf float @llvm.fabs.f32(float [[SQRTF]])
	; ANY-NEXT: [[ISINF:%.*]] = fcmp oeq float [[X]], 0xFFF0000000000000			; ANY-NEXT: ret float [[ABS]]
	; ANY-NEXT: [[TMP1:%.*]] = select i1 [[ISINF]], float 0x7FF0000000000000, float [[ABS]]			; VC32-NEXT: [[POW:%.]] = call ninf float @powf(float [[X:%.]], float 5.000000e-01)
	; ANY-NEXT: ret float [[TMP1]]
	; VC32-NEXT: [[POW:%.]] = call float @powf(float [[X:%.]], float 5.000000e-01)
	; VC32-NEXT: ret float [[POW]]			; VC32-NEXT: ret float [[POW]]
	; VC51-NEXT: [[POW:%.]] = call float @powf(float [[X:%.]], float 5.000000e-01)			; VC51-NEXT: [[POW:%.]] = call ninf float @powf(float [[X:%.]], float 5.000000e-01)
	; VC51-NEXT: ret float [[POW]]			; VC51-NEXT: ret float [[POW]]
	; VC64-NEXT: [[SQRTF:%.]] = call float @sqrtf(float [[X:%.]])			; VC64-NEXT: [[SQRTF:%.]] = call ninf float @sqrtf(float [[X:%.]])
	; VC64-NEXT: [[ABS:%.*]] = call float @llvm.fabs.f32(float [[SQRTF]])			; VC64-NEXT: [[ABS:%.*]] = call ninf float @llvm.fabs.f32(float [[SQRTF]])
	; VC64-NEXT: [[ISINF:%.*]] = fcmp oeq float [[X]], 0xFFF0000000000000			; VC64-NEXT: ret float [[ABS]]
	; VC64-NEXT: [[TMP1:%.*]] = select i1 [[ISINF]], float 0x7FF0000000000000, float [[ABS]]			; VC83-NEXT: [[SQRTF:%.]] = call ninf float @sqrtf(float [[X:%.]])
	; VC64-NEXT: ret float [[TMP1]]			; VC83-NEXT: [[ABS:%.*]] = call ninf float @llvm.fabs.f32(float [[SQRTF]])
	; VC83-NEXT: [[SQRTF:%.]] = call float @sqrtf(float [[X:%.]])			; VC83-NEXT: ret float [[ABS]]
	; VC83-NEXT: [[ABS:%.*]] = call float @llvm.fabs.f32(float [[SQRTF]])			; NOLIB-NEXT: [[POW:%.]] = call ninf float @powf(float [[X:%.]], float 5.000000e-01)
	; VC83-NEXT: [[ISINF:%.*]] = fcmp oeq float [[X]], 0xFFF0000000000000
	; VC83-NEXT: [[TMP1:%.*]] = select i1 [[ISINF]], float 0x7FF0000000000000, float [[ABS]]
	; VC83-NEXT: ret float [[TMP1]]
	; NOLIB-NEXT: [[POW:%.]] = call float @powf(float [[X:%.]], float 5.000000e-01)
	; NOLIB-NEXT: ret float [[POW]]			; NOLIB-NEXT: ret float [[POW]]
	;			;
	%retval = call float @powf(float %x, float 0.5)			%retval = call ninf float @powf(float %x, float 0.5)
				spatelUnsubmitted Done Reply Inline Actions I realize this will be a little more work, but it would be better to replicate this test with the additional FMF (and similarly for other tests that are changing the flags and/or libcall/intrinsic). That way, we'll retain the likely original intent of the test and add coverage for the cases that we want to verify are not miscompiled. spatel: I realize this will be a little more work, but it would be better to replicate this test with…
				hubert.reinterpretcastAuthorUnsubmitted Done Reply Inline Actions I'm not convinced the coverage is meaningfully being lost. There is already a fairly exhaustive combination of tests in `pow-sqrt.ll`. The `double` version of this test (with no FMF) appears twice already. I meant what I said when I wrote that the changes are made "depending on emphasis in the checks for library calls, avoidance of overlap, and overall coverage". Following that guideline, I found the specific test changes to be rather deterministic. hubert.reinterpretcast: I'm not convinced the coverage is meaningfully being lost. There is already a fairly exhaustive…
				spatelUnsubmitted Done Reply Inline Actions OK. I agree that there are a lot of tests for this transform (because it's been shown buggy even before the bug you found). If we can organize it better that would be great, but that doesn't need to hold up the fix for a miscompile. spatel: OK. I agree that there are a lot of tests for this transform (because it's been shown buggy…
	ret float %retval			ret float %retval
	}			}

	define double @pow_libcall_to_select_sqrt(double %x) {			; Check pow(x, 0.5) where x may be -infinity does not call a library sqrt function.
	; CHECK-LABEL: @pow_libcall_to_select_sqrt(
	; LIB-NEXT: [[SQRT:%.]] = call double @sqrt(double [[X:%.]])			define double @pow_libcall_half_no_FMF(double %x) {
	; LIB-NEXT: [[ABS:%.*]] = call double @llvm.fabs.f64(double [[SQRT]])			; CHECK-LABEL: @pow_libcall_half_no_FMF(
	; LIB-NEXT: [[ISINF:%.*]] = fcmp oeq double [[X]], 0xFFF0000000000000			; CHECK-NEXT: [[POW:%.]] = call double @pow(double [[X:%.]], double 5.000000e-01)
	; LIB-NEXT: [[TMP1:%.*]] = select i1 [[ISINF]], double 0x7FF0000000000000, double [[ABS]]			; CHECK-NEXT: ret double [[POW]]
	; LIB-NEXT: ret double [[TMP1]]
	; NOLIB-NEXT: [[POW:%.]] = call double @pow(double [[X:%.]], double 5.000000e-01)
	; NOLIB-NEXT: ret double [[POW]]
	;			;
	%retval = call double @pow(double %x, double 0.5)			%retval = call double @pow(double %x, double 0.5)
	ret double %retval			ret double %retval
	}			}

	; Check pow(-infinity, 0.5) -> +infinity.			; Check pow(-infinity, 0.5) -> +infinity.

	define float @test_simplify9(float %x) {			define float @test_simplify9(float %x) {
	; CHECK-LABEL: @test_simplify9(			; CHECK-LABEL: @test_simplify9(
	; ANY-NEXT: ret float 0x7FF0000000000000			; CHECK-NEXT: ret float 0x7FF0000000000000
	; VC32-NEXT: [[POW:%.*]] = call float @powf(float 0xFFF0000000000000, float 5.000000e-01)
	; VC32-NEXT: ret float [[POW]]
	; VC51-NEXT: [[POW:%.*]] = call float @powf(float 0xFFF0000000000000, float 5.000000e-01)
	; VC51-NEXT: ret float [[POW]]
	; VC64-NEXT: ret float 0x7FF0000000000000
	; VC83-NEXT: ret float 0x7FF0000000000000
	; NOLIB-NEXT: [[POW:%.*]] = call float @powf(float 0xFFF0000000000000, float 5.000000e-01)
	; NOLIB-NEXT: ret float [[POW]]
	;			;
	%retval = call float @powf(float 0xFFF0000000000000, float 0.5)			%retval = call float @llvm.pow.f32(float 0xFFF0000000000000, float 0.5)
	ret float %retval			ret float %retval
	}			}

	define double @test_simplify10(double %x) {			define double @test_simplify10(double %x) {
	; CHECK-LABEL: @test_simplify10(			; CHECK-LABEL: @test_simplify10(
	; LIB-NEXT: ret double 0x7FF0000000000000			; CHECK-NEXT: ret double 0x7FF0000000000000
	; NOLIB-NEXT: [[POW:%.*]] = call double @pow(double 0xFFF0000000000000, double 5.000000e-01)
	; NOLIB-NEXT: ret double [[POW]]
	;			;
	%retval = call double @pow(double 0xFFF0000000000000, double 0.5)			%retval = call double @llvm.pow.f64(double 0xFFF0000000000000, double 0.5)
	ret double %retval			ret double %retval
	}			}

	; Check pow(x, 1.0) -> x.			; Check pow(x, 1.0) -> x.

	define float @test_simplify11(float %x) {			define float @test_simplify11(float %x) {
	; CHECK-LABEL: @test_simplify11(			; CHECK-LABEL: @test_simplify11(
	; ANY-NEXT: ret float [[X:%.*]]			; ANY-NEXT: ret float [[X:%.*]]
	; CHECK-LABEL: @pow_neg1_double_fastv(			; CHECK-LABEL: @pow_neg1_double_fastv(
	; CHECK-NEXT: [[RECIPROCAL:%.*]] = fdiv fast <2 x double> <double 1.000000e+00, double 1.000000e+00>, [[X:%.*]]			; CHECK-NEXT: [[RECIPROCAL:%.*]] = fdiv fast <2 x double> <double 1.000000e+00, double 1.000000e+00>, [[X:%.*]]
	; CHECK-NEXT: ret <2 x double> [[RECIPROCAL]]			; CHECK-NEXT: ret <2 x double> [[RECIPROCAL]]
	;			;
	%r = call fast <2 x double> @llvm.pow.v2f64(<2 x double> %x, <2 x double> <double -1.0, double -1.0>)			%r = call fast <2 x double> @llvm.pow.v2f64(<2 x double> %x, <2 x double> <double -1.0, double -1.0>)
	ret <2 x double> %r			ret <2 x double> %r
	}			}

	define double @test_simplify17(double %x) {			define double @pow_intrinsic_half_no_FMF(double %x) {
	; CHECK-LABEL: @test_simplify17(			; CHECK-LABEL: @pow_intrinsic_half_no_FMF(
	; CHECK-NEXT: [[SQRT:%.*]] = call double @llvm.sqrt.f64(double [[X:%.*]])			; CHECK-NEXT: [[SQRT:%.*]] = call double @llvm.sqrt.f64(double [[X:%.*]])
	; CHECK-NEXT: [[ABS:%.*]] = call double @llvm.fabs.f64(double [[SQRT]])			; CHECK-NEXT: [[ABS:%.*]] = call double @llvm.fabs.f64(double [[SQRT]])
	; CHECK-NEXT: [[ISINF:%.*]] = fcmp oeq double [[X]], 0xFFF0000000000000			; CHECK-NEXT: [[ISINF:%.*]] = fcmp oeq double [[X]], 0xFFF0000000000000
	; CHECK-NEXT: [[TMP1:%.*]] = select i1 [[ISINF]], double 0x7FF0000000000000, double [[ABS]]			; CHECK-NEXT: [[TMP1:%.*]] = select i1 [[ISINF]], double 0x7FF0000000000000, double [[ABS]]
	; CHECK-NEXT: ret double [[TMP1]]			; CHECK-NEXT: ret double [[TMP1]]
	;			;
	%retval = call double @llvm.pow.f64(double %x, double 0.5)			%retval = call double @llvm.pow.f64(double %x, double 0.5)
	ret double %retval			ret double %retval

llvm/test/Transforms/InstCombine/pow-sqrt.ll

	; NOTE: Assertions have been autogenerated by utils/update_test_checks.py			; NOTE: Assertions have been autogenerated by utils/update_test_checks.py
	; RUN: opt < %s -instcombine -S | FileCheck %s			; RUN: opt < %s -instcombine -S | FileCheck %s

	; Check the libcall and the intrinsic for each case with differing FMF.			; Check the libcall and the intrinsic for each case with differing FMF.

	; The transform to sqrt is allowed as long as we deal with -0.0 and -INF.			; The transform to sqrt is not allowed if we risk setting errno due to -INF.

	define double @pow_libcall_half_no_FMF(double %x) {			define double @pow_libcall_half_no_FMF(double %x) {
	; CHECK-LABEL: @pow_libcall_half_no_FMF(			; CHECK-LABEL: @pow_libcall_half_no_FMF(
	; CHECK-NEXT: [[SQRT:%.*]] = call double @sqrt(double [[X:%.*]])			; CHECK-NEXT: [[POW:%.*]] = call double @pow(double [[X:%.*]], double 5.000000e-01)
	; CHECK-NEXT: [[ABS:%.*]] = call double @llvm.fabs.f64(double [[SQRT]])			; CHECK-NEXT: ret double [[POW]]
	; CHECK-NEXT: [[ISINF:%.*]] = fcmp oeq double [[X]], 0xFFF0000000000000
	; CHECK-NEXT: [[TMP1:%.*]] = select i1 [[ISINF]], double 0x7FF0000000000000, double [[ABS]]
	; CHECK-NEXT: ret double [[TMP1]]
	;			;
	%pow = call double @pow(double %x, double 5.0e-01)			%pow = call double @pow(double %x, double 5.0e-01)
	ret double %pow			ret double %pow
	}			}

				; The transform to (non-errno setting) sqrt is allowed as long as we deal with -0.0 and -INF.

	define double @pow_intrinsic_half_no_FMF(double %x) {			define double @pow_intrinsic_half_no_FMF(double %x) {
	; CHECK-LABEL: @pow_intrinsic_half_no_FMF(			; CHECK-LABEL: @pow_intrinsic_half_no_FMF(
	; CHECK-NEXT: [[SQRT:%.*]] = call double @llvm.sqrt.f64(double [[X:%.*]])			; CHECK-NEXT: [[SQRT:%.*]] = call double @llvm.sqrt.f64(double [[X:%.*]])
	; CHECK-NEXT: [[ABS:%.*]] = call double @llvm.fabs.f64(double [[SQRT]])			; CHECK-NEXT: [[ABS:%.*]] = call double @llvm.fabs.f64(double [[SQRT]])
	; CHECK-NEXT: [[ISINF:%.*]] = fcmp oeq double [[X]], 0xFFF0000000000000			; CHECK-NEXT: [[ISINF:%.*]] = fcmp oeq double [[X]], 0xFFF0000000000000
	; CHECK-NEXT: [[TMP1:%.*]] = select i1 [[ISINF]], double 0x7FF0000000000000, double [[ABS]]			; CHECK-NEXT: [[TMP1:%.*]] = select i1 [[ISINF]], double 0x7FF0000000000000, double [[ABS]]
	; CHECK-NEXT: ret double [[TMP1]]			; CHECK-NEXT: ret double [[TMP1]]
	;			;
	%pow = call double @llvm.pow.f64(double %x, double 5.0e-01)			%pow = call double @llvm.pow.f64(double %x, double 5.0e-01)
	ret double %pow			ret double %pow
	}			}
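
The CHECK lines for the intrinsic describe a three-step expansion: take the square root, apply `fabs` so that `-0.0` maps to `+0.0`, and `select` `+inf` when the input is `-inf`. A rough C++ rendering of that sequence is sketched below. The function name is hypothetical, and `std::sqrt` stands in for `llvm.sqrt.f64` here, so unlike the intrinsic it may touch `errno`.

```cpp
#include <cmath>
#include <limits>

// Hypothetical C++ rendering of the expanded IR; not code from the patch.
double pow_half_expanded(double x) {
  const double inf = std::numeric_limits<double>::infinity();
  double root = std::sqrt(x);          // stands in for llvm.sqrt.f64
  double magnitude = std::fabs(root);  // sqrt(-0.0) is -0.0; pow gives +0.0
  return (x == -inf) ? inf : magnitude;  // fcmp oeq + select for -INF
}
```

For finite negative inputs both `pow(x, 0.5)` and this expansion produce NaN, so only `-0.0` and `-inf` need special handling.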

	; This makes no difference, but FMF are propagated.			; This makes no difference, but FMF are propagated/retained.
				spatel: This comment should be more like above: "; The transform to sqrt is not allowed if we risk setting errno due to -INF." Although that raises a question: if the user has allowed an approximation of pow(), do they really expect that errno would be set accurately? Similarly, if they allowed 'reassoc'...
				hubert.reinterpretcast (author): Given NaN propagation, if the `sqrt` was okay, then the result here should be NaN. That is, I think the question is more "can we just do the transform and omit the select"?
				hubert.reinterpretcast (author): I've updated the comment.

	define double @pow_libcall_half_approx(double %x) {			define double @pow_libcall_half_approx(double %x) {
	; CHECK-LABEL: @pow_libcall_half_approx(			; CHECK-LABEL: @pow_libcall_half_approx(
	; CHECK-NEXT: [[SQRT:%.*]] = call afn double @sqrt(double [[X:%.*]])			; CHECK-NEXT: [[POW:%.*]] = call afn double @pow(double [[X:%.*]], double 5.000000e-01)
	; CHECK-NEXT: [[ABS:%.*]] = call afn double @llvm.fabs.f64(double [[SQRT]])			; CHECK-NEXT: ret double [[POW]]
	; CHECK-NEXT: [[ISINF:%.*]] = fcmp afn oeq double [[X]], 0xFFF0000000000000
	; CHECK-NEXT: [[TMP1:%.*]] = select afn i1 [[ISINF]], double 0x7FF0000000000000, double [[ABS]]
	; CHECK-NEXT: ret double [[TMP1]]
	;			;
	%pow = call afn double @pow(double %x, double 5.0e-01)			%pow = call afn double @pow(double %x, double 5.0e-01)
	ret double %pow			ret double %pow
	}			}

	define <2 x double> @pow_intrinsic_half_approx(<2 x double> %x) {			define <2 x double> @pow_intrinsic_half_approx(<2 x double> %x) {
	; CHECK-LABEL: @pow_intrinsic_half_approx(			; CHECK-LABEL: @pow_intrinsic_half_approx(
	; CHECK-NEXT: [[SQRT:%.*]] = call afn <2 x double> @llvm.sqrt.v2f64(<2 x double> [[X:%.*]])			; CHECK-NEXT: [[SQRT:%.*]] = call afn <2 x double> @llvm.sqrt.v2f64(<2 x double> [[X:%.*]])
	; CHECK-NEXT: [[SQRT:%.*]] = call ninf <2 x double> @llvm.sqrt.v2f64(<2 x double> [[X:%.*]])			; CHECK-NEXT: [[SQRT:%.*]] = call ninf <2 x double> @llvm.sqrt.v2f64(<2 x double> [[X:%.*]])
	; CHECK-NEXT: [[ABS:%.*]] = call ninf <2 x double> @llvm.fabs.v2f64(<2 x double> [[SQRT]])			; CHECK-NEXT: [[ABS:%.*]] = call ninf <2 x double> @llvm.fabs.v2f64(<2 x double> [[SQRT]])
	; CHECK-NEXT: ret <2 x double> [[ABS]]			; CHECK-NEXT: ret <2 x double> [[ABS]]
	;			;
	%pow = call ninf <2 x double> @llvm.pow.v2f64(<2 x double> %x, <2 x double> <double 5.0e-01, double 5.0e-01>)			%pow = call ninf <2 x double> @llvm.pow.v2f64(<2 x double> %x, <2 x double> <double 5.0e-01, double 5.0e-01>)
	ret <2 x double> %pow			ret <2 x double> %pow
	}			}

	; If we can disregard -0.0, no need for fabs.			; If we can disregard -0.0, no need for fabs, but still (because of -INF) cannot use library sqrt.

	define double @pow_libcall_half_nsz(double %x) {			define double @pow_libcall_half_nsz(double %x) {
	; CHECK-LABEL: @pow_libcall_half_nsz(			; CHECK-LABEL: @pow_libcall_half_nsz(
	; CHECK-NEXT: [[SQRT:%.*]] = call nsz double @sqrt(double [[X:%.*]])			; CHECK-NEXT: [[POW:%.*]] = call nsz double @pow(double [[X:%.*]], double 5.000000e-01)
	; CHECK-NEXT: [[ISINF:%.*]] = fcmp nsz oeq double [[X]], 0xFFF0000000000000			; CHECK-NEXT: ret double [[POW]]
	; CHECK-NEXT: [[TMP1:%.*]] = select nsz i1 [[ISINF]], double 0x7FF0000000000000, double [[SQRT]]
	; CHECK-NEXT: ret double [[TMP1]]
	;			;
	%pow = call nsz double @pow(double %x, double 5.0e-01)			%pow = call nsz double @pow(double %x, double 5.0e-01)
	ret double %pow			ret double %pow
	}			}
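
The `nsz` comment above relies on the fact that `sqrt(-0.0)` is `-0.0` while `pow(-0.0, 0.5)` is `+0.0`, so the two differ only in the sign of zero. The sketch below mirrors the pre-patch `nsz` expansion (the left-hand CHECK lines) in C++. The function name is hypothetical, and `std::sqrt` is used only to show the values; it says nothing about `errno`, which is the reason the library call is no longer folded without `ninf`.

```cpp
#include <cmath>
#include <limits>

// Hypothetical rendering of the nsz expansion: no fabs, select kept for -INF.
double pow_half_nsz(double x) {
  const double inf = std::numeric_limits<double>::infinity();
  return (x == -inf) ? inf : std::sqrt(x);
}
```

With `nsz` the caller has promised not to care about the sign of a zero result, so returning `-0.0` where `pow` would return `+0.0` is acceptable and the `fabs` can be dropped.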

	define double @pow_intrinsic_half_nsz(double %x) {			define double @pow_intrinsic_half_nsz(double %x) {
	; CHECK-LABEL: @pow_intrinsic_half_nsz(			; CHECK-LABEL: @pow_intrinsic_half_nsz(
	; CHECK-NEXT: [[SQRT:%.*]] = call nsz double @llvm.sqrt.f64(double [[X:%.*]])			; CHECK-NEXT: [[SQRT:%.*]] = call nsz double @llvm.sqrt.f64(double [[X:%.*]])
	; CHECK-LABEL: @pow_libcall_neghalf_no_FMF(			; CHECK-LABEL: @pow_libcall_neghalf_no_FMF(
	; CHECK-NEXT: [[POW:%.*]] = call float @powf(float [[X:%.*]], float -5.000000e-01)			; CHECK-NEXT: [[POW:%.*]] = call float @powf(float [[X:%.*]], float -5.000000e-01)
	; CHECK-NEXT: ret float [[POW]]			; CHECK-NEXT: ret float [[POW]]
	;			;
	%pow = call float @powf(float %x, float -5.0e-01)			%pow = call float @powf(float %x, float -5.0e-01)
	ret float %pow			ret float %pow
	}			}

				; If we can disregard INFs, a call to a library sqrt is okay.
	; Transform to sqrt+fdiv because 'reassoc' allows an extra rounding step.			; Transform to sqrt+fdiv because 'reassoc' allows an extra rounding step.
	; Use 'fabs' to handle -0.0 correctly.			; Use 'fabs' to handle -0.0 correctly.
	; Use 'select' to handle -INF correctly.

	define float @pow_libcall_neghalf_reassoc(float %x) {			define float @pow_libcall_neghalf_reassoc_ninf(float %x) {
	; CHECK-LABEL: @pow_libcall_neghalf_reassoc(			; CHECK-LABEL: @pow_libcall_neghalf_reassoc_ninf(
	; CHECK-NEXT: [[SQRTF:%.*]] = call reassoc float @sqrtf(float [[X:%.*]])			; CHECK-NEXT: [[SQRTF:%.*]] = call reassoc ninf float @sqrtf(float [[X:%.*]])
	; CHECK-NEXT: [[ABS:%.*]] = call reassoc float @llvm.fabs.f32(float [[SQRTF]])			; CHECK-NEXT: [[ABS:%.*]] = call reassoc ninf float @llvm.fabs.f32(float [[SQRTF]])
	; CHECK-NEXT: [[ISINF:%.*]] = fcmp reassoc oeq float [[X]], 0xFFF0000000000000			; CHECK-NEXT: [[RECIPROCAL:%.*]] = fdiv reassoc ninf float 1.000000e+00, [[ABS]]
	; CHECK-NEXT: [[ABS_OP:%.*]] = fdiv reassoc float 1.000000e+00, [[ABS]]
	; CHECK-NEXT: [[RECIPROCAL:%.*]] = select i1 [[ISINF]], float 0.000000e+00, float [[ABS_OP]]
	; CHECK-NEXT: ret float [[RECIPROCAL]]			; CHECK-NEXT: ret float [[RECIPROCAL]]
	;			;
	%pow = call reassoc float @powf(float %x, float -5.0e-01)			%pow = call reassoc ninf float @powf(float %x, float -5.0e-01)
	ret float %pow			ret float %pow
	}			}
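
For the exponent `-0.5`, the pre-patch expansion computes `1.0 / fabs(sqrt(x))` and selects `0.0` when the input is `-inf`; with `ninf` the select is dropped because infinities are assumed not to occur. The C++ sketch below renders the version with the select, using `double` rather than `float` for brevity. The function name is hypothetical and is not taken from the patch.

```cpp
#include <cmath>
#include <limits>

// Hypothetical rendering of the sqrt+fdiv expansion with the -INF select.
double pow_neghalf_expanded(double x) {
  const double inf = std::numeric_limits<double>::infinity();
  double magnitude = std::fabs(std::sqrt(x));  // fabs handles -0.0
  double reciprocal = 1.0 / magnitude;         // the extra rounding step
  return (x == -inf) ? 0.0 : reciprocal;       // pow(-INF, -0.5) is +0.0
}
```

Without the select, an input of `-inf` would yield `1.0 / NaN`, which is NaN rather than the `+0.0` that `pow` returns.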

	; Transform to sqrt+fdiv because 'afn' allows an extra rounding step.			; If we cannot disregard INFs, a call to a library sqrt is not okay.
	; Use 'fabs' to handle -0.0 correctly.
	; Use 'select' to handle -INF correctly.

	define float @pow_libcall_neghalf_afn(float %x) {			define float @pow_libcall_neghalf_afn(float %x) {
	; CHECK-LABEL: @pow_libcall_neghalf_afn(			; CHECK-LABEL: @pow_libcall_neghalf_afn(
	; CHECK-NEXT: [[SQRTF:%.*]] = call afn float @sqrtf(float [[X:%.*]])			; CHECK-NEXT: [[POW:%.*]] = call afn float @powf(float [[X:%.*]], float -5.000000e-01)
	; CHECK-NEXT: [[ABS:%.*]] = call afn float @llvm.fabs.f32(float [[SQRTF]])			; CHECK-NEXT: ret float [[POW]]
	; CHECK-NEXT: [[ISINF:%.*]] = fcmp afn oeq float [[X]], 0xFFF0000000000000
	; CHECK-NEXT: [[ABS_OP:%.*]] = fdiv afn float 1.000000e+00, [[ABS]]
	; CHECK-NEXT: [[RECIPROCAL:%.*]] = select i1 [[ISINF]], float 0.000000e+00, float [[ABS_OP]]
	; CHECK-NEXT: ret float [[RECIPROCAL]]
	;			;
	%pow = call afn float @powf(float %x, float -5.0e-01)			%pow = call afn float @powf(float %x, float -5.0e-01)
	ret float %pow			ret float %pow
	}			}

	; This should not be transformed without some kind of FMF.			; This should not be transformed without some kind of FMF.

	define <2 x double> @pow_intrinsic_neghalf_no_FMF(<2 x double> %x) {			define <2 x double> @pow_intrinsic_neghalf_no_FMF(<2 x double> %x) {
	; CHECK-NEXT: [[ABS:%.*]] = call ninf afn <2 x double> @llvm.fabs.v2f64(<2 x double> [[SQRT]])			; CHECK-NEXT: [[ABS:%.*]] = call ninf afn <2 x double> @llvm.fabs.v2f64(<2 x double> [[SQRT]])
	; CHECK-NEXT: [[RECIPROCAL:%.*]] = fdiv ninf afn <2 x double> <double 1.000000e+00, double 1.000000e+00>, [[ABS]]			; CHECK-NEXT: [[RECIPROCAL:%.*]] = fdiv ninf afn <2 x double> <double 1.000000e+00, double 1.000000e+00>, [[ABS]]
	; CHECK-NEXT: ret <2 x double> [[RECIPROCAL]]			; CHECK-NEXT: ret <2 x double> [[RECIPROCAL]]
	;			;
	%pow = call afn ninf <2 x double> @llvm.pow.v2f64(<2 x double> %x, <2 x double> <double -5.0e-01, double -5.0e-01>)			%pow = call afn ninf <2 x double> @llvm.pow.v2f64(<2 x double> %x, <2 x double> <double -5.0e-01, double -5.0e-01>)
	ret <2 x double> %pow			ret <2 x double> %pow
	}			}

	; If we can disregard -0.0, no need for fabs.			; If we can disregard -0.0, no need for fabs, but still (because of -INF) cannot use library sqrt.

	define double @pow_libcall_neghalf_nsz(double %x) {			define double @pow_libcall_neghalf_nsz(double %x) {
	; CHECK-LABEL: @pow_libcall_neghalf_nsz(			; CHECK-LABEL: @pow_libcall_neghalf_nsz(
	; CHECK-NEXT: [[SQRT:%.*]] = call nsz afn double @sqrt(double [[X:%.*]])			; CHECK-NEXT: [[POW:%.*]] = call nsz afn double @pow(double [[X:%.*]], double -5.000000e-01)
	; CHECK-NEXT: [[ISINF:%.*]] = fcmp nsz afn oeq double [[X]], 0xFFF0000000000000			; CHECK-NEXT: ret double [[POW]]
	; CHECK-NEXT: [[SQRT_OP:%.*]] = fdiv nsz afn double 1.000000e+00, [[SQRT]]
	; CHECK-NEXT: [[RECIPROCAL:%.*]] = select i1 [[ISINF]], double 0.000000e+00, double [[SQRT_OP]]
	; CHECK-NEXT: ret double [[RECIPROCAL]]
	;			;
	%pow = call afn nsz double @pow(double %x, double -5.0e-01)			%pow = call afn nsz double @pow(double %x, double -5.0e-01)
	ret double %pow			ret double %pow
	}			}

	define double @pow_intrinsic_neghalf_nsz(double %x) {			define double @pow_intrinsic_neghalf_nsz(double %x) {
	; CHECK-LABEL: @pow_intrinsic_neghalf_nsz(			; CHECK-LABEL: @pow_intrinsic_neghalf_nsz(
	; CHECK-NEXT: [[SQRT:%.*]] = call nsz afn double @llvm.sqrt.f64(double [[X:%.*]])			; CHECK-NEXT: [[SQRT:%.*]] = call nsz afn double @llvm.sqrt.f64(double [[X:%.*]])

llvm/test/Transforms/InstCombine/win-math.ll

	; MSVC83: float @sqrtf			; MSVC83: float @sqrtf
	; MSVC83: float @llvm.fabs.f32(			; MSVC83: float @llvm.fabs.f32(
	; MINGW32-NOT: float @powf			; MINGW32-NOT: float @powf
	; MINGW32: float @sqrtf			; MINGW32: float @sqrtf
	; MINGW32: float @llvm.fabs.f32			; MINGW32: float @llvm.fabs.f32
	; MINGW64-NOT: float @powf			; MINGW64-NOT: float @powf
	; MINGW64: float @sqrtf			; MINGW64: float @sqrtf
	; MINGW64: float @llvm.fabs.f32(			; MINGW64: float @llvm.fabs.f32(
	%1 = call float @powf(float %x, float 0.5)			%1 = call ninf float @powf(float %x, float 0.5)
	ret float %1			ret float %1
	}			}
