Page MenuHomePhabricator

[InstCombine] Fix errno bug in pow expansion to sqrt

Authored by hubert.reinterpretcast on Sep 17 2020, 7:58 PM.



A conversion from pow to sqrt shall not call an errno-setting sqrt with -infinity: the sqrt will set EDOM where the pow call need not.

This patch avoids the erroneous (pun not intended) transformation by applying the restrictions discussed in the thread for

The existing tests are updated (depending on emphasis in the checks for library calls, avoidance of overlap, and overall coverage):

  • to add ninf, retaining the intended library call,
  • to use the intrinsic, retaining the use of select, or
  • to expect the replacement to not occur.

The following is tested:

  • The pow intrinsic folds to a select instruction to handle -infinity.
  • The pow library call folds, with ninf, to sqrt without the select instruction associated with handling -infinity.
  • The pow library call does not fold to sqrt without ninf.

Diff Detail

Event Timeline

Herald added a project: Restricted Project. · View Herald TranscriptSep 17 2020, 7:58 PM
Herald added a subscriber: hiraditya. · View Herald Transcript
hubert.reinterpretcast requested review of this revision.Sep 17 2020, 7:58 PM
hubert.reinterpretcast edited the summary of this revision. (Show Details)Sep 18 2020, 11:01 AM
spatel added inline comments.Sep 21 2020, 5:54 AM

This transform is making this patch more complicated, right?
How about adding an explicit check to avoid it for the single sqrt case (that could be a preliminary NFC cleanup patch if you prefer). Then we just add the more obvious correctness check inside of replacePowWithSqrt():

diff --git a/llvm/lib/Transforms/Utils/SimplifyLibCalls.cpp b/llvm/lib/Transforms/Utils/SimplifyLibCalls.cpp
index 60b7da7e64f..4ce580f47ad 100644
--- a/llvm/lib/Transforms/Utils/SimplifyLibCalls.cpp
+++ b/llvm/lib/Transforms/Utils/SimplifyLibCalls.cpp
@@ -1636,6 +1636,14 @@ Value *LibCallSimplifier::replacePowWithSqrt(CallInst *Pow, IRBuilderBase &B) {
   if (ExpoF->isNegative() && (!Pow->hasApproxFunc() && !Pow->hasAllowReassoc()))
     return nullptr;
+  // If we have a pow() library call (accesses memory) and we can't guarantee
+  // that the base is not Inf, give up:
+  // pow(-Inf, 0.5) does not set errno (result is +Inf as shown below), but
+  // sqrt(-Inf) may set errno.
+  if (!Pow->doesNotAccessMemory() && !Pow->hasNoInfs() &&
+      !isKnownNeverInfinity(Base, TLI))
+    return nullptr;
   Sqrt = getSqrtCall(Base, Attrs, Pow->doesNotAccessMemory(), Mod, B, TLI);
   if (!Sqrt)
     return nullptr;
@@ -1722,7 +1730,8 @@ Value *LibCallSimplifier::optimizePow(CallInst *Pow, IRBuilderBase &B) {
   // pow(x, n) -> x * x * x * ...
   const APFloat *ExpoF;
-  if (AllowApprox && match(Expo, m_APFloat(ExpoF))) {
+  if (AllowApprox && match(Expo, m_APFloat(ExpoF)) &&
+      !ExpoF->isExactlyValue(0.5) && !ExpoF->isExactlyValue(-0.5)) {
     // We limit to a max of 7 multiplications, thus the maximum exponent is 32.
     // If the exponent is an integer+0.5 we generate a call to sqrt and an
     // additional fmul.

I realize this will be a little more work, but it would be better to replicate this test with the additional FMF (and similarly for other tests that are changing the flags and/or libcall/intrinsic).

That way, we'll retain the likely original intent of the test and add coverage for the cases that we want to verify are not miscompiled.


I'm not convinced the coverage is meaningfully being lost. There is already a fairly exhaustive combination of tests in pow-sqrt.ll. The double version of this test (with no FMF) appears twice already.

I meant what I said when I wrote that the changes are made "depending on emphasis in the checks for library calls, avoidance of overlap, and overall coverage". Following that guideline, I found the specific test changes to be rather deterministic.

spatel added inline comments.Sep 21 2020, 8:27 AM

OK. I agree that there are a lot of tests for this transform (because it's been shown buggy even before the bug you found). If we can organize it better that would be great, but that doesn't need to hold up the fix for a miscompile.


This comment should be more like above:

; The transform to sqrt is not allowed if we risk setting errno due to -INF.

Although that raises a question: if the user has allowed an approximation of pow(), do they really expect that errno would be set accurately? Similarly, if they allowed 'reassoc'...

hubert.reinterpretcast marked 2 inline comments as done.Sep 21 2020, 8:35 AM
hubert.reinterpretcast added inline comments.

Given NaN propagation, if the sqrt was okay, then the result here should be NaN. That is, I think the question is more "can we just do the transform and omit the select"?


I'll look into committing the check as an NFC cleanup. I'll probably also move this transform into its own function (like the earlier transforms) as an NFC cleanup too. That would make the (possibly unintended) different handling of the early exit cases from this transformation (compared to the other ones) more obvious.


Splitting this part out is indeed a good idea. The change is not NFC and actually fixes a problem:

LLVM ERROR: Instruction Combining seems stuck in an infinite loop after 100 iterations.

on a case (with opt -instcombine -S -disable-builtin sqrt) like:

; (float)pow((double)(float)x, 0.5)
define float @shrink_pow_libcall_half(float %x) {
  %dx = fpext float %x to double
  %call = call fast double @pow(double %dx, double 0.5)
  %fr = fptrunc double %call to float
  ret float %fr
hubert.reinterpretcast marked an inline comment as done.
  • Rebase on top of D88066; add isKnownNeverInfinity condition
  • Adjust comment for afn case
hubert.reinterpretcast marked an inline comment as done.Sep 22 2020, 9:17 AM
hubert.reinterpretcast added inline comments.

@spatel, I've added the isKnownNeverInfinity check here as suggested in your comment below. I am not entirely sure whether it is quite effective though. It seems this query may happening "too early".

Example case I tried (with opt -instcombine):

define double @pow_libcall_half_no_FMF_base_ninf(i32 %x) {
  %conv = sitofp i32 %x to double
  %pow = call double @pow(double %conv, double 5.0e-01)
  ret double %pow

I've updated the comment.

spatel accepted this revision.Sep 22 2020, 12:51 PM



This is yet another bug independent of this patch:

$ opt -instsimplify inf.ll -S
define i1 @src(i32 %x) {
  %conv = sitofp i32 %x to double
  %r = fcmp oeq double %conv, 0x7FF0000000000000
  ret i1 %r
$ opt -instcombine inf.ll -S
define i1 @src(i32 %x) {
  ret i1 false
So instcombine knows that transform, but ValueTracking/instsimplify do not. I'll put that on my TODO list. If you change your example to use "uitofp" it should work as expected.

This revision is now accepted and ready to land.Sep 22 2020, 12:51 PM
This revision was landed with ongoing or failed builds.Sep 22 2020, 4:00 PM
This revision was automatically updated to reflect the committed changes.

I haven't been following this closely, but is there some reason we can't transform powf(x, 0.5) to sqrt(x == -infinity ? qnan : x)?

I haven't been following this closely, but is there some reason we can't transform powf(x, 0.5) to sqrt(x == -infinity ? qnan : x)?

Hmm, moving the select inwards should work, yes: sqrt(x == -infinity ? +infinity : x).