This is an archive of the discontinued LLVM Phabricator instance.

[SimplifyLibCalls] powf(x, sitofp(n)) -> powi(x, n)
ClosedPublic

Authored by xbolva00 on Jun 7 2019, 5:56 PM.

Diff Detail

Event Timeline

xbolva00 created this revision.Jun 7 2019, 5:56 PM
Herald added a project: Restricted Project. · View Herald TranscriptJun 7 2019, 5:56 PM

@spatel do you want to see some more tests?

I will precommit them when they are OK.

xbolva00 updated this revision to Diff 203657.Jun 7 2019, 6:07 PM

Removed one invalid Todo comment

xbolva00 marked an inline comment as done.Jun 7 2019, 6:09 PM
xbolva00 added inline comments.
lib/Transforms/Utils/SimplifyLibCalls.cpp
1495

We have no pow(int, int) intrinsic right? So there is no easy way how to do it as follow up...

xbolva00 marked an inline comment as done.Jun 7 2019, 6:14 PM
xbolva00 added inline comments.
lib/Transforms/Utils/SimplifyLibCalls.cpp
1493

Maybe I should move transformation before:
if (Value *Exp = replacePowWithExp(Pow, B))

return Exp;

?

nikic added a subscriber: nikic.EditedJun 8 2019, 12:45 AM

Can you please provide some more information under which circumstances powf(x, (float) y) will provide a different result than powi(x, y), and which fast-math flags specifically are necessary to make that transform legal?

Can you please provide some more information under which circumstances powf(x, (float) y) will provide a different result than powi(x, y), and which fast-math flags specifically are necessary to make that transform legal?

I don’t have such info, I personally think this is fine also without fast math, but since I cannot say it for sure, In the first version, I put it under fast math.

Maybe some experts like @spatel or @efriedma can confirm that this transformation is always fine?

xbolva00 updated this revision to Diff 203682.Jun 8 2019, 4:44 AM

Removed "isFast" requirement for powf(x, sitofp(n)) -> powi(x, n).
New: powf(x, C) -> powi(x, C) iff C is a constant integer value

xbolva00 updated this revision to Diff 203683.Jun 8 2019, 4:49 AM

added new test

xbolva00 updated this revision to Diff 203684.Jun 8 2019, 5:07 AM

Some tests precommited for review.
Rebased.

xbolva00 marked an inline comment as done.Jun 8 2019, 5:10 AM
xbolva00 added inline comments.
test/Transforms/InstCombine/pow_fp_int.ll
304

@spatel @nikic what do you think about this case ?

xbolva00 updated this revision to Diff 203691.Jun 8 2019, 7:26 AM

Handle uitofp too

Can you please provide some more information under which circumstances powf(x, (float) y) will provide a different result than powi(x, y), and which fast-math flags specifically are necessary to make that transform legal?

I don’t have such info, I personally think this is fine also without fast math, but since I cannot say it for sure, In the first version, I put it under fast math.

Maybe some experts like @spatel or @efriedma can confirm that this transformation is always fine?

I don't see how this is valid without some kind of fast-math. What if the integer exponent is not exactly representable as an FP value?

$ cat powi.c 
#include <stdio.h>
#include <math.h>
#include <stdlib.h>

int main(int argc, char *argv[]) {
  float base = atof(argv[1]);
  printf("base as float = %.8f\n", base);

  int exponent = atoi(argv[2]);
  printf("exponent = %d\n", exponent);
  printf("exponent as float = %.8f\n", (float)exponent);

  float d = powf(base, exponent);
  float i = __builtin_powif(base, exponent);
  printf("powf = %f\n", d);
  printf("powif = %f\n", i);
  return 0;
}

$ ./a.out 1.0000001 16777217
base as float = 1.00000012
exponent = 16777217
exponent as float = 16777216.00000000
powf = 7.389055
powif = 7.385338

We definitely need afn for this; powi performs multiple intermediate rounding steps, so it can be significantly less accurate than pow.

Beyond that, we might need nsz in some cases? Probably worth writing a bunch of tests for zero/inf/nan base with zero/positive/negative exponents to figure out exactly which cases are different.

lib/Transforms/Utils/SimplifyLibCalls.cpp
1455

I think you're missing some checks here.

1544

powi takes a signed exponent.

xbolva00 updated this revision to Diff 203892.Jun 10 2019, 1:46 PM

Transform only when full fast math mode.

xbolva00 marked 2 inline comments as done.Jun 10 2019, 1:51 PM
xbolva00 added inline comments.
lib/Transforms/Utils/SimplifyLibCalls.cpp
1455

I think isFast (-Ofast) check is good enough for now.

I wrote some tests with various bases, https://pastebin.com/xpysEY0f.
I got same output for pow and powi.

1544

false means isUnsigned = false. Or if you meant a comment - I added it there more explicitely.

If you meant something else, I don't know what is wrong :(

Some more things to do, or is it fine now? :)

efriedma added inline comments.Jun 12 2019, 11:52 AM
lib/Transforms/Utils/SimplifyLibCalls.cpp
1455

This code doesn't even run in those cases?

I'm specifically concerned about cases where the exponent isn't an int32_t... if it's wider, or unsigned.

1544

That's all I meant; didn't realize that was "isUnsigned"...

xbolva00 updated this revision to Diff 204367.Jun 12 2019, 3:19 PM

More check for exponent int bitwidth.
Added more tests.

xbolva00 marked an inline comment as done.Jun 12 2019, 3:19 PM
xbolva00 added inline comments.
lib/Transforms/Utils/SimplifyLibCalls.cpp
1455

You are right, we need to check it.

xbolva00 updated this revision to Diff 204370.Jun 12 2019, 3:21 PM

Revert unneeded formating changes.

efriedma added inline comments.Jun 12 2019, 3:35 PM
test/Transforms/InstCombine/pow_fp_int.ll
301

I don't think this is right; consider, for example , pow(.999999999,4000000000).

xbolva00 marked an inline comment as done.Jun 12 2019, 3:38 PM
xbolva00 added inline comments.
test/Transforms/InstCombine/pow_fp_int.ll
301

This is a negative test, nothing was changed here.

xbolva00 marked an inline comment as done.Jun 12 2019, 3:40 PM
xbolva00 added inline comments.
test/Transforms/InstCombine/pow_fp_int.ll
301

Yeah, I should change variable naming in negative tests

efriedma added inline comments.Jun 12 2019, 3:47 PM
test/Transforms/InstCombine/pow_fp_int.ll
178

Meant to write a comment on this. Treating "uitofp" like this means you're converting pow(.999999999,4000000000) into pow(.999999999,-294967296).

301

Accidentally commented on the wrong test.

xbolva00 marked an inline comment as done.Jun 12 2019, 4:17 PM
xbolva00 added inline comments.
test/Transforms/InstCombine/pow_fp_int.ll
178

Ah, right.

Can we still do this atleast for some "unsigned" cases, up to i16 (i31?) ?

xbolva00 updated this revision to Diff 204389.Jun 12 2019, 4:50 PM

Handle uitofp better.

xbolva00 marked an inline comment as done.Jun 21 2019, 9:25 AM
xbolva00 added inline comments.
test/Transforms/InstCombine/pow_fp_int.ll
178

It should be ok now. PTAL @efriedma

It would be nice to use the exact necessary fast-math flags here, while we're thinking about it, instead of just "isFast()". From the discussion, it seems like we only need "afn"?

It would be nice to use the exact necessary fast-math flags here, while we're thinking about it, instead of just "isFast()". From the discussion, it seems like we only need "afn"?

Okey. Maybe @spatel could help us with fast flags, whether afn is enough.

It would be nice to use the exact necessary fast-math flags here, while we're thinking about it, instead of just "isFast()". From the discussion, it seems like we only need "afn"?

Okey. Maybe @spatel could help us with fast flags, whether afn is enough.

Yes, I think 'afn' gives us the freedom for this sort of thing. As a practical matter, I'm not sure if clang has the means to turn on 'afn' without the entirety of "-ffast-math", but that may change in the future.

xbolva00 updated this revision to Diff 206066.Jun 21 2019, 1:38 PM

Require just "afn".

It looks like you didn't change all the uses of isFast() in optimizePow?

Also, I'd like to see some performance numbers; I assume powi is faster, but it would be nice to confirm, particularly for larger exponents.

xbolva00 added a comment.EditedJun 21 2019, 2:55 PM

clang -O3 pw.c -lm
xbolva00@xbolva00-G551JW:~$ time ./a.out &> log

real 0m0,195s
user 0m0,195s
sys 0m0,000s
xbolva00@xbolva00-G551JW:~$ time ./a.out &> log

real 0m0,195s
user 0m0,195s
sys 0m0,000s
xbolva00@xbolva00-G551JW:~$ time ./a.out &> log

real 0m0,195s
user 0m0,195s
sys 0m0,000s

clang -Ofast pw.c -lm / clang -O3 -ffast-math pw.c -lm

xbolva00@xbolva00-G551JW:~$ time ./a.out &> log

real 0m0,053s
user 0m0,049s
sys 0m0,004s
xbolva00@xbolva00-G551JW:~$ time ./a.out &> log

real 0m0,050s
user 0m0,050s
sys 0m0,000s
xbolva00@xbolva00-G551JW:~$ time ./a.out &> log

real 0m0,051s
user 0m0,051s
sys 0m0,000s

"Benchmark": https://pastebin.com/Z0yZD4qU

xbolva00 updated this revision to Diff 206089.Jun 21 2019, 3:01 PM

Replace IsFast with AllowApprox

Anything else to address?

efriedma accepted this revision.Jul 1 2019, 4:42 PM

LGTM

lib/Transforms/Utils/SimplifyLibCalls.cpp
1414

Indentation

This revision is now accepted and ready to land.Jul 1 2019, 4:42 PM
xbolva00 updated this revision to Diff 207567.Jul 2 2019, 8:39 AM

Fixed formatting

Anything else to address?

LGTM

Thank you!

This revision was automatically updated to reflect the committed changes.