
Details

Reviewers

hfinkel
spatel
efriedma
lebedev.ri

Commits

rG8817658836ac: [InstCombine] Missed optimization in math expression: simplify calls exp…
rL352730: [InstCombine] Missed optimization in math expression: simplify calls exp…

Summary

This patch enables folding the following expressions under the -ffast-math flag: exp(X) * exp(Y) -> exp(X + Y) and exp2(X) * exp2(Y) -> exp2(X + Y). Motivation: https://bugs.llvm.org/show_bug.cgi?id=35594
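As a rough illustration of why this fold is gated on fast-math flags, the standalone sketch below compares the two forms numerically. The function names are made up for this example and are not part of the patch. For ordinary inputs the two forms agree to within rounding, but for extreme inputs the product form can overflow and underflow where the folded form does not.

```cpp
#include <cassert>
#include <cmath>

// Original form: two exp calls and one multiply.
double expProduct(double x, double y) { return std::exp(x) * std::exp(y); }

// Folded form: one add and one exp call.
double expOfSum(double x, double y) { return std::exp(x + y); }

// Relative difference between the two forms.
// Only meaningful when the folded result is finite and nonzero.
double relativeDifference(double x, double y) {
  double folded = expOfSum(x, y);
  assert(std::isfinite(folded) && folded != 0.0);
  return std::fabs(expProduct(x, y) - folded) / std::fabs(folded);
}
```

With x = 1000 and y = -1000, `expProduct` evaluates inf * 0, which is NaN, while `expOfSum` evaluates exp(0), which is 1. The two forms are therefore not interchangeable under strict IEEE semantics.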

Diff Detail

Event Timeline

Quolyk created this revision. Dec 18 2017, 1:21 AM

Quolyk mentioned this in D41381: [InstSimplify] Missed optimization in math expression: squashing exp(log), log(exp). Dec 19 2017, 2:21 AM

hfinkel added inline comments. Dec 19 2017, 8:03 PM

lib/Transforms/InstCombine/InstCombineMulDivRem.cpp
737	Needs BuilderTy::FastMathFlagGuard Guard(Builder); first (so that the flags will get unset in the builder as required).
739	Line is too long.
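The guard requested above is an RAII pattern: save the builder's flags on entry and restore them when the scope ends, so that flags set for one fold do not leak into later instructions. The sketch below shows the idea with toy types. `ToyFlags`, `ToyBuilder`, and `ToyFlagGuard` are hypothetical stand-ins invented for this example, not LLVM's classes.

```cpp
#include <cassert>

// Hypothetical stand-ins for illustration only (not LLVM's API).
struct ToyFlags {
  bool reassoc = false;
};

struct ToyBuilder {
  ToyFlags flags;  // flags applied to every instruction the builder creates
};

// Saves the builder's flags on construction and restores them on destruction.
class ToyFlagGuard {
  ToyBuilder &builder;
  ToyFlags saved;

public:
  explicit ToyFlagGuard(ToyBuilder &b) : builder(b), saved(b.flags) {}
  ~ToyFlagGuard() { builder.flags = saved; }
  ToyFlagGuard(const ToyFlagGuard &) = delete;
  ToyFlagGuard &operator=(const ToyFlagGuard &) = delete;
};

// Sets the instruction's flags inside a guarded scope and returns the reassoc
// flag the builder carried while that scope was active.
bool foldWithInstructionFlags(ToyBuilder &builder, ToyFlags instFlags) {
  ToyFlags before = builder.flags;
  bool sawReassoc = false;
  {
    ToyFlagGuard guard(builder);  // must come first, before the flags change
    builder.flags = instFlags;
    sawReassoc = builder.flags.reassoc;
  }  // guard restores the original flags here
  assert(builder.flags.reassoc == before.reassoc && "guard must restore flags");
  return sawReassoc;
}
```

Declaring the guard before changing the flags is the point of the review comment: without it, the builder would keep the fold's flags after the fold returns.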

hfinkel added inline comments. Dec 19 2017, 8:13 PM

lib/Transforms/InstCombine/InstCombineMulDivRem.cpp
736	Also, ->hasOneUse() checks here.

Since I'm new to LLVM, I looked through other test files and noticed that it isn't necessary to write full test checks; instead, people write just one instruction to make sure the test is OK, while subsequent instructions don't matter. I wonder which style is preferred?

Quolyk marked 3 inline comments as done. Dec 20 2017, 12:56 AM

In D41342#960505, @Quolyk wrote:

Since I'm new to LLVM, I looked through other test files and noticed that it isn't necessary to write full test checks; instead, people write just one instruction to make sure the test is OK, while subsequent instructions don't matter. I wonder which style is preferred?

Checks used to be mostly written by hand (and had a lot of shortcomings and bugs because of that). Now that we have scripts to auto-generate checks, I prefer that you use those scripts when possible. It should make subsequent changes or additions to the test file easier for other contributors.

Quolyk updated this revision to Diff 127832. Dec 20 2017, 11:45 PM

Quolyk updated this revision to Diff 128408. Jan 2 2018, 4:22 AM

Quolyk edited the summary of this revision.

Quolyk updated this revision to Diff 129399. Jan 11 2018, 12:00 AM

Quolyk updated this revision to Diff 135633. Feb 23 2018, 6:22 AM

Quolyk edited reviewers, added: efriedma; removed: davide.

Quolyk updated this revision to Diff 155848. Jul 17 2018, 4:13 AM

lebedev.ri added a subscriber: lebedev.ri. Jul 17 2018, 4:26 AM

lebedev.ri added inline comments.

test/Transforms/InstCombine/fmul-exp.ll
16	Please cleanup testcases, name all the variables (`-instnamer` will help) (When committing, please commit tests first, so the code change shows the test change)
31–32	I think this can still be transformed, as long as at least one of them has only one use. This would be transformed into
    %2 = call fast double @llvm.exp.f64(double %b)
    call void @use(double %2)
    %t1 = fadd fast double %a, %b
    %t2 = call fast double @llvm.exp.f64(double %t1)
    ret double %t2
which has the same number of instructions. Also add a test where both of them have two uses, which won't be transformed.
test/Transforms/InstCombine/fmul-exp2.ll
63	Please add newlines
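The one-use argument above can be checked with a simple instruction count. The sketch below is a toy cost model written for this illustration (the function names are assumptions, and it is not how InstCombine measures cost). It counts the two exp calls and the fmul before the fold, then counts the fadd, the new exp call, and any original exp call that must be kept for other users.

```cpp
#include <cassert>

// Before the fold: exp(a), exp(b), and the fmul, regardless of extra uses.
int instructionsBefore() { return 3; }

// After the fold: one fadd and one exp call, plus every original exp call
// that still has users other than the fmul and so cannot be deleted.
int instructionsAfter(unsigned uses0, unsigned uses1) {
  assert(uses0 >= 1 && uses1 >= 1 && "each exp call feeds the fmul");
  int keptCalls = (uses0 > 1 ? 1 : 0) + (uses1 > 1 ? 1 : 0);
  return 2 + keptCalls;
}

// The fold is worthwhile in this model if it does not add instructions.
bool foldDoesNotGrowCode(unsigned uses0, unsigned uses1) {
  return instructionsAfter(uses0, uses1) <= instructionsBefore();
}
```

Under this model the fold breaks even when exactly one operand has an extra use (3 instructions before and after) and only grows the code when both operands have extra uses, which matches the request for a negative test with two multi-use operands.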

Update tests.

Harbormaster completed remote builds in B27127: Diff 182787. Jan 21 2019, 6:42 AM

Quolyk edited the summary of this revision. Jan 21 2019, 6:55 AM

spatel added inline comments. Jan 21 2019, 7:13 AM

test/Transforms/InstCombine/fmul-exp.ll
63	We don't need the full "fast" to enable these transforms. Can you adjust the tests to show the minimal amount of FMF necessary to allow the optimizations? We need "reassoc" + anything else? Do you have commit access? If so, please commit these files to trunk with the current (without this code patch) output.

Update tests.

Harbormaster completed remote builds in B27435: Diff 184064. Jan 29 2019, 5:34 AM

Quolyk marked 3 inline comments as done. Jan 29 2019, 5:37 AM

Quolyk added inline comments.

test/Transforms/InstCombine/fmul-exp.ll
63	I have commit access, but I don't want to commit without patch approval. Revision is fixed according to your comments.

spatel added inline comments. Jan 29 2019, 11:58 AM

test/Transforms/InstCombine/fmul-exp.ll
31–32	This comment has not been addressed. I agree with Roman - if there's only 1 extra use, it makes sense to reduce the dependency chain and reduce fmul to fadd.
63	Thanks. I think the patch is close to approval, so you should commit the tests now. There are 2 benefits to committing the tests first:
1. You and the reviewers can see the actual diffs that will occur with the code change. I.e., if something in a test is not already in canonical form, you can correct or show that in the baseline test commit.
2. If the code change must be reverted for some reason, we should not lose the regression test coverage because of that revert.

Quolyk mentioned this in rL352613: Commit tests for changes in revision D41342. Jan 30 2019, 1:50 AM

Update tests.

Quolyk marked 4 inline comments as done. Jan 30 2019, 5:42 AM

Harbormaster completed remote builds in B27493: Diff 184280. Jan 30 2019, 5:42 AM

Quolyk mentioned this in D41940: [InstSimplify] Missed optimization in math expression: log10(pow(10.0,x)) == x, log2(pow(2.0,x)) == x. Jan 30 2019, 5:51 AM

LGTM, but please wait for @spatel.

test/Transforms/InstCombine/fmul-exp.ll
63	> We need "reassoc" + anything else?
Any further thoughts here? Is just "reassoc" enough? (That is what is currently being done.)

This revision is now accepted and ready to land. Jan 30 2019, 5:51 AM

lebedev.ri added inline comments. Jan 30 2019, 5:55 AM

test/Transforms/InstCombine/fmul-exp2.ll
74	Hm that is interesting, somehow i have never seen that before. So the FMF can be set on such intrinsic calls too. Here, we only need `reassoc` on the original `fmul`, not the `exp`?

Quolyk marked an inline comment as done. Jan 30 2019, 5:59 AM

Quolyk added inline comments.

test/Transforms/InstCombine/fmul-exp2.ll
74	I've been experimenting with this. As @spatel mentioned, we need at minimum `reassoc` here. As it appears, every `fmul` needs `reassoc`, but the `call double` instructions don't need `reassoc`.
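The flag requirement described in this thread can be modelled as a check on the fmul's flags only. The sketch below is a toy model under that assumption. `ToyFMF` and `canFoldExpProduct` are names invented for this example, and the enum only mirrors the names of LLVM's fast-math flags rather than using the real `FastMathFlags` class.

```cpp
#include <cassert>

// Hypothetical bitmask mirroring the names of LLVM's fast-math flags.
enum ToyFMF : unsigned {
  kReassoc = 1u << 0,
  kNNaN = 1u << 1,
  kNInf = 1u << 2,
  kNSZ = 1u << 3,
  kArcp = 1u << 4,
  kContract = 1u << 5,
  kAfn = 1u << 6,
  kFast = (1u << 7) - 1  // "fast" implies every individual flag
};

// Models the legality check discussed in the review: only the reassoc bit on
// the fmul matters. The flags on the two exp calls are accepted but ignored.
bool canFoldExpProduct(unsigned fmulFlags, unsigned /*exp0Flags*/,
                       unsigned /*exp1Flags*/) {
  assert((fmulFlags & ~static_cast<unsigned>(kFast)) == 0 && "unknown flag bit");
  return (fmulFlags & kReassoc) != 0;
}
```

In this model a `fmul reassoc` fed by plain `call double` instructions still folds, while a `fmul nsz nnan` fed by `call fast` instructions does not.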

LGTM

test/Transforms/InstCombine/fmul-exp2.ll
74	Yes, FMF can be set for any call. IIRC, we first needed this for llvm.sqrt, but it can be used with any FP call. And yes, FMF is only required on the fmul here because that value has loosened strictness, so we assume that intermediate values leading up to it may also use that loosened strictness to compute the final result. We should probably make that clearer in the LangRef. The flag requirement that trips me up most often on these is 'nsz', but I think we're safe here:
    exp2(+/-0.0) * exp2(+/-0.0) --> 1 * 1 --> 1
    exp2((+/-0.0) + (+/-0.0)) --> exp2(+/-0.0) --> 1
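The signed-zero argument above can be verified by enumerating all four sign combinations. This is a small standalone check written for illustration (the function name is an assumption). It relies on the C library's `exp2` returning exactly 1 for both +0.0 and -0.0.

```cpp
#include <cassert>
#include <cmath>

// Returns true if exp2(x) * exp2(y) and exp2(x + y) both equal +1.0 for every
// combination of positive and negative zero inputs.
bool signedZerosAgree() {
  const double zeros[] = {0.0, -0.0};
  assert(std::signbit(zeros[1]) && "sketch assumes signed zeros exist");
  for (double x : zeros) {
    for (double y : zeros) {
      double product = std::exp2(x) * std::exp2(y);
      double folded = std::exp2(x + y);
      if (product != 1.0 || folded != 1.0)
        return false;
      if (std::signbit(product) != std::signbit(folded))
        return false;
    }
  }
  return true;
}
```

Since the result is 1 in every case, the sign of a zero input never reaches the output, which is why `nsz` is not needed for this fold.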

Closed by commit rL352730: [InstCombine] Missed optimization in math expression: simplify calls exp… (authored by Quolyk). Jan 30 2019, 10:28 PM

This revision was automatically updated to reflect the committed changes.

@spatel @lebedev.ri thank you for review and helping me get through.

Diff 127665

lib/Transforms/InstCombine/InstCombineMulDivRem.cpp

if (OpX && OpY) {
Log2->setArgOperand(0, OpY);
Value *FMulVal = Builder.CreateFMul(OpX, Log2);
Value *FSub = Builder.CreateFSub(FMulVal, OpX);
FSub->takeName(&I);
return replaceInstUsesWith(I, FSub);
}
}

		if (AllowReassociate) {
		Value *Opnd0 = nullptr;
		Value *Opnd1 = nullptr;
		if (Op0->hasOneUse() && Op1->hasOneUse()) {
		BuilderTy::FastMathFlagGuard Guard(Builder);
		Builder.setFastMathFlags(I.getFastMathFlags());
		// exp(a) * exp(b) -> exp(a + b)
		if (match(Op0, m_Intrinsic<Intrinsic::exp>(m_Value(Opnd0))) &&
		match(Op1, m_Intrinsic<Intrinsic::exp>(m_Value(Opnd1)))) {
		Value *FAddVal = Builder.CreateFAdd(Opnd0, Opnd1);
		Value *Exp =
		Intrinsic::getDeclaration(I.getModule(), Intrinsic::exp, I.getType());
		Value *ExpCall = Builder.CreateCall(Exp, FAddVal);
		return replaceInstUsesWith(I, ExpCall);
		}

		// exp2(a) * exp2(b) -> exp2(a + b)
		if (match(Op0, m_Intrinsic<Intrinsic::exp2>(m_Value(Opnd0))) &&
		match(Op1, m_Intrinsic<Intrinsic::exp2>(m_Value(Opnd1)))) {
		Value *FAddVal = Builder.CreateFAdd(Opnd0, Opnd1);
		Value *Exp =
		Intrinsic::getDeclaration(I.getModule(), Intrinsic::exp2, I.getType());
		Value *ExpCall = Builder.CreateCall(Exp, FAddVal);
		return replaceInstUsesWith(I, ExpCall);
		}
		}
		}

// Handle symmetric situation in a 2-iteration loop
Value *Opnd0 = Op0;
Value *Opnd1 = Op1;
for (int i = 0; i < 2; i++) {
bool IgnoreZeroSign = I.hasNoSignedZeros();
if (BinaryOperator::isFNeg(Opnd0, IgnoreZeroSign)) {
BuilderTy::FastMathFlagGuard Guard(Builder);
Builder.setFastMathFlags(I.getFastMathFlags());

test/Transforms/InstCombine/fmul-exp.ll

This file was added.

				; RUN: opt -S -instcombine < %s | FileCheck %s

				declare double @llvm.exp.f64(double) nounwind readnone speculatable
				declare void @use(double)

				; exp(a) * exp(b) no math flags
				; CHECK-LABEL @exp_a_exp_b(
				; CHECK: %1 = call double @llvm.exp.f64(double %a)
				; CHECK: %2 = call double @llvm.exp.f64(double %b)
				; CHECK: %mul = fmul double %1, %2
				; CHECK: ret double %mul
				define double @exp_a_exp_b(double %a, double %b) {
				%1 = call double @llvm.exp.f64(double %a)
				%2 = call double @llvm.exp.f64(double %b)
				%mul = fmul double %1, %2
				ret double %mul
				}

				; exp(a) * exp(b) fast-math, multiple uses
				; CHECK-LABEL @exp_a_exp_b_multiple_uses(
				; CHECK: %1 = call fast double @llvm.exp.f64(double %a)
				; CHECK: %2 = call fast double @llvm.exp.f64(double %b)
				; CHECK: %mul = fmul fast double %1, %2
				; CHECK: call void @use(double %2)
				; CHECK: ret double %mul
				define double @exp_a_exp_b_multiple_uses(double %a, double %b) {
				%1 = call fast double @llvm.exp.f64(double %a)
				%2 = call fast double @llvm.exp.f64(double %b)
				%mul = fmul fast double %1, %2
				call void @use(double %2)
				ret double %mul
				}

				; exp(a) * exp(b) => exp(a+b) with fast-math
				; CHECK-LABEL @exp_a_exp_b_fast(
				; CHECK: %1 = fadd fast double %a, %b
				; CHECK: %2 = call fast double @llvm.exp.f64(double %1)
				; CHECK: ret double %2
				define double @exp_a_exp_b_fast(double %a, double %b) {
				%1 = call fast double @llvm.exp.f64(double %a)
				%2 = call fast double @llvm.exp.f64(double %b)
				%mul = fmul fast double %1, %2
				ret double %mul
				}

				; exp(a) * exp(b) * exp(c) * exp(d) => exp(a+b+c+d) with fast-math
				; CHECK-LABEL @exp_a_exp_b_exp_c_exp_d_fast(
				; CHECK: %1 = fadd fast double %a, %b
				; CHECK: %2 = fadd fast double %1, %c
				; CHECK: %3 = fadd fast double %2, %d
				; CHECK: %4 = call fast double @llvm.exp.f64(double %3)
				; CHECK: ret double %4
				define double @exp_a_exp_b_exp_c_exp_d_fast(double %a, double %b, double %c, double %d) {
				%1 = call fast double @llvm.exp.f64(double %a)
				%2 = call fast double @llvm.exp.f64(double %b)
				%mul = fmul fast double %1, %2
				%3 = call fast double @llvm.exp.f64(double %c)
				%mul1 = fmul fast double %mul, %3
				%4 = call fast double @llvm.exp.f64(double %d)
				%mul2 = fmul fast double %mul1, %4
				ret double %mul2
				}

test/Transforms/InstCombine/fmul-exp2.ll

This file was added.

				; RUN: opt -S -instcombine < %s | FileCheck %s

				declare double @llvm.exp2.f64(double) nounwind readnone speculatable
				declare void @use(double)

				; exp2(a) * exp2(b) no math flags
				; CHECK-LABEL @exp2_a_exp2_b(
				; CHECK: %1 = call double @llvm.exp2.f64(double %a)
				; CHECK: %2 = call double @llvm.exp2.f64(double %b)
				; CHECK: %mul = fmul double %1, %2
				; CHECK: ret double %mul
				define double @exp2_a_exp2_b(double %a, double %b) {
				%1 = call double @llvm.exp2.f64(double %a)
				%2 = call double @llvm.exp2.f64(double %b)
				%mul = fmul double %1, %2
				ret double %mul
				}

				; exp2(a) * exp2(b) fast-math, multiple uses
				; CHECK-LABEL @exp2_a_exp2_b_multiple_uses(
				; CHECK: %1 = call fast double @llvm.exp2.f64(double %a)
				; CHECK: %2 = call fast double @llvm.exp2.f64(double %b)
				; CHECK: %mul = fmul fast double %1, %2
				; CHECK: call void @use(double %2)
				; CHECK: ret double %mul
				define double @exp2_a_exp2_b_multiple_uses(double %a, double %b) {
				%1 = call fast double @llvm.exp2.f64(double %a)
				%2 = call fast double @llvm.exp2.f64(double %b)
				%mul = fmul fast double %1, %2
				call void @use(double %2)
				ret double %mul
				}

				; exp2(a) * exp2(b) => exp2(a+b) with fast-math
				; CHECK-LABEL @exp2_a_exp2_b_fast(
				; CHECK: %1 = fadd fast double %a, %b
				; CHECK: %2 = call fast double @llvm.exp2.f64(double %1)
				; CHECK: ret double %2
				define double @exp2_a_exp2_b_fast(double %a, double %b) {
				%1 = call fast double @llvm.exp2.f64(double %a)
				%2 = call fast double @llvm.exp2.f64(double %b)
				%mul = fmul fast double %1, %2
				ret double %mul
				}

				; exp2(a) * exp2(b) * exp2(c) * exp2(d) => exp2(a+b+c+d) with fast-math
				; CHECK-LABEL @exp2_a_exp2_b_exp2_c_exp2_d_fast(
				; CHECK: %1 = fadd fast double %a, %b
				; CHECK: %2 = fadd fast double %1, %c
				; CHECK: %3 = fadd fast double %2, %d
				; CHECK: %4 = call fast double @llvm.exp2.f64(double %3)
				; CHECK: ret double %4
				define double @exp2_a_exp2_b_exp2_c_exp2_d(double %a, double %b, double %c, double %d) {
				%1 = call fast double @llvm.exp2.f64(double %a)
				%2 = call fast double @llvm.exp2.f64(double %b)
				%mul = fmul fast double %1, %2
				%3 = call fast double @llvm.exp2.f64(double %c)
				%mul1 = fmul fast double %mul, %3
				%4 = call fast double @llvm.exp2.f64(double %d)
				%mul2 = fmul fast double %mul1, %4
				ret double %mul2
				}

This is an archive of the discontinued LLVM Phabricator instance.

[InstCombine] Missed optimization in math expression: simplify calls exp functions
Closed, Public

