This is an archive of the discontinued LLVM Phabricator instance.

[InstCombine] Missed optimization in math expression: simplify calls exp functions
ClosedPublic

Authored by Quolyk on Dec 18 2017, 1:21 AM.

Details

Summary

This patch enables folding following expressions under -ffast-math flag: exp(X) * exp(Y) -> exp(X + Y), exp2(X) * exp2(Y) -> exp2(X + Y). Motivation: https://bugs.llvm.org/show_bug.cgi?id=35594

Diff Detail

Event Timeline

Quolyk created this revision.Dec 18 2017, 1:21 AM
hfinkel added inline comments.Dec 19 2017, 8:03 PM
lib/Transforms/InstCombine/InstCombineMulDivRem.cpp
737

Needs

BuilderTy::FastMathFlagGuard Guard(Builder);

first (so that the flags will get unset in the builder as required).

739

Line is too long.

hfinkel added inline comments.Dec 19 2017, 8:13 PM
lib/Transforms/InstCombine/InstCombineMulDivRem.cpp
736

Also, ->hasOneUse() checks here.

Quolyk updated this revision to Diff 127665.Dec 20 2017, 12:55 AM
Quolyk retitled this revision from [WIP][InstCombine] Missed optimization in math expression: simplify calls exp functions to [InstCombine] Missed optimization in math expression: simplify calls exp functions.

Since I'm new to llvm, I looked though other test files, and noticed that it isn't necessary to write full test-check, instead people write just one instruction to make sure test is ok, while consequent instructions doesn't matter. I wonder what style is preferred?

Quolyk marked 3 inline comments as done.Dec 20 2017, 12:56 AM

Since I'm new to llvm, I looked though other test files, and noticed that it isn't necessary to write full test-check, instead people write just one instruction to make sure test is ok, while consequent instructions doesn't matter. I wonder what style is preferred?

Checks used to be mostly written by hand (and had a lot of shortcomings and bugs because of that). Now that we have scripts to auto-generate checks, I prefer that you use those scripts when possible. It should make subsequent changes or additions to the test file easier for other contributors.

Quolyk updated this revision to Diff 127832.Dec 20 2017, 11:45 PM
Quolyk updated this revision to Diff 128408.Jan 2 2018, 4:22 AM
Quolyk edited the summary of this revision. (Show Details)
Quolyk updated this revision to Diff 129399.Jan 11 2018, 12:00 AM
Quolyk updated this revision to Diff 135633.Feb 23 2018, 6:22 AM
Quolyk edited reviewers, added: efriedma; removed: davide.
Quolyk updated this revision to Diff 155848.Jul 17 2018, 4:13 AM
lebedev.ri added inline comments.
test/Transforms/InstCombine/fmul-exp.ll
16

Please cleanup testcases, name all the variables (-instnamer will help)
(When committing, please commit tests first, so the code change shows the test change)

31–32

I think this can still be transformed, as long as at least one of them has only one use.
This would be transformed into

%2 = call fast double @llvm.exp.f64(double %b)
call void @use(double %2)
%t1 = add fast double %a, %b
%t2 = call fast double @llvm.exp.f64(double %t2)
ret double %t2

which has the same number of instruction,
And add a test where both of them are two-use, which won't be transformed.

test/Transforms/InstCombine/fmul-exp2.ll
63

Please add newlines

Quolyk updated this revision to Diff 182787.Jan 21 2019, 6:42 AM

Update tests.

Quolyk edited the summary of this revision. (Show Details)Jan 21 2019, 6:55 AM
spatel added inline comments.Jan 21 2019, 7:13 AM
test/Transforms/InstCombine/fmul-exp.ll
63

We don't need the full "fast" to enable these transforms. Can you adjust the tests to show the minimal amount of FMF necessary to allow the optimizations? We need "reassoc" + anything else?

Do you have commit access? If so, please commit these files to trunk with the current (without this code patch) output.

Quolyk updated this revision to Diff 184064.Jan 29 2019, 5:34 AM

Update tests.

Quolyk marked 3 inline comments as done.Jan 29 2019, 5:37 AM
Quolyk added inline comments.
test/Transforms/InstCombine/fmul-exp.ll
63

I have commit access, but I don't want to commit without patch approval. Revision is fixed according to your comments.

spatel added inline comments.Jan 29 2019, 11:58 AM
test/Transforms/InstCombine/fmul-exp.ll
31–32

This comment has not been addressed. I agree with Roman - if there's only 1 extra use, it makes sense to reduce the dependency chain and reduce fmul to fadd.

63

Thanks. I think the patch is close to approval, so you should commit the tests now. There are 2 benefits to committing the tests first:

  1. You and the reviewers can see the actual diffs that will occur with the code change. Ie, if something in a test is not already in canonical form, you can correct or show that in the baseline test commit.
  2. If the code change must be reverted for some reason, we should not lose the regression test coverage because of that revert.
Quolyk updated this revision to Diff 184280.Jan 30 2019, 5:40 AM
Quolyk marked an inline comment as done.

Update tests.

Quolyk marked 4 inline comments as done.Jan 30 2019, 5:42 AM
lebedev.ri accepted this revision.Jan 30 2019, 5:51 AM

LGTM, but please wait for @spatel.

test/Transforms/InstCombine/fmul-exp.ll
63

We need "reassoc" + anything else?

Any further thoughts here? Just "reassoc" is enough?
(that is what is currently being done)

This revision is now accepted and ready to land.Jan 30 2019, 5:51 AM
lebedev.ri added inline comments.Jan 30 2019, 5:55 AM
test/Transforms/InstCombine/fmul-exp2.ll
74

Hm that is interesting, somehow i have never seen that before.
So the FMF can be set on such intrinsic calls too.
Here, we only need reassoc on the original fmul, not the exp?

Quolyk marked an inline comment as done.Jan 30 2019, 5:59 AM
Quolyk added inline comments.
test/Transforms/InstCombine/fmul-exp2.ll
74

I've been experimenting with this. As @spatel mentioned we need minimum reassoc calls here. So as it appears we need all fmul reassoc, however call double doesn't need reassoc.

spatel accepted this revision.Jan 30 2019, 8:02 AM

LGTM

test/Transforms/InstCombine/fmul-exp2.ll
74

Yes, FMF can be set for any call. IIRC, we first needed this for llvm.sqrt, but it can be used with any FP call.

And yes, FMF is only required on the fmul here because that value has loosened strictness, so we assume that intermediate values leading up to it may also use that loosened strictness to compute the final result. We should probably make that clearer in the LangRef.

The flag requirement that trips me up most often on these is 'nsz', but I think we're safe here:
exp2(+/-0.0) * exp2(+/-0.0) --> 1 * 1 --> 1
exp2((+-0.0) + (+/-0.0)) --> exp2(+/- 0.0) --> 1

This revision was automatically updated to reflect the committed changes.

@spatel @lebedev.ri thank you for review and helping me get through.