This is an archive of the discontinued LLVM Phabricator instance.

[InstCombine] Missed optimization for pow(x, y) * pow(x, z) with fast-math
ClosedPublic

Authored by vdsered on May 16 2021, 4:03 AM.

Details

Summary

If fast-math is set, then LLVM is free to do the following transformation pow(x, y) * pow(x, z) -> pow(x, y + z). This patch adds this transformation and tests for it. See more https://bugs.llvm.org/show_bug.cgi?id=47205

It handles two cases

  1. When operands of fmul are different instructions

%4 = call reassoc float @llvm.pow.f32(float %0, float %1)
%5 = call reassoc float @llvm.pow.f32(float %0, float %2)
%6 = fmul reassoc float %5, %4
-->
%3 = fadd reassoc float %1, %2
%4 = call reassoc float @llvm.pow.f32(float %0, float %3)

  1. When operands of fmul are the same instruction

%4 = call reassoc float @llvm.pow.f32(float %0, float %1)
%5 = fmul reassoc float %4, %4
-->
%3 = fadd reassoc float %1, %1
%4 = call reassoc float @llvm.pow.f32(float %0, float %3)

Diff Detail

Event Timeline

vdsered created this revision.May 16 2021, 4:03 AM
vdsered requested review of this revision.May 16 2021, 4:03 AM
vdsered updated this revision to Diff 345698.May 16 2021, 4:36 AM

Calling CreateBinaryIntrinsic instead of CreateIntrinsic for better readability

I have noticed that there are two similiar transformations below exp2(X) * exp2(Y) -> exp2(X + Y) and exp(X) * exp(Y) -> exp(X + Y) that are done when each operand of multiplication has exactly one use which is n't true if, for example, there is exp2(X) * exp2(X) where exp2(X) is the same instruction, so it can be improved a little. I'm going to update the patch

I have noticed that there are two similiar transformations below exp2(X) * exp2(Y) -> exp2(X + Y) and exp(X) * exp(Y) -> exp(X + Y) that are done when each operand of multiplication has exactly one use which is n't true if, for example, there is exp2(X) * exp2(X) where exp2(X) is the same instruction, so it can be improved a little. I'm going to update the patch

I recommend that you make that use-check logic into a helper function as a preliminary step, so it can be be accessed from other places. For example, we miss this transform too because the use check is too restrictive:
https://alive2.llvm.org/ce/z/27ryac

I have noticed that there are two similiar transformations below exp2(X) * exp2(Y) -> exp2(X + Y) and exp(X) * exp(Y) -> exp(X + Y) that are done when each operand of multiplication has exactly one use which is n't true if, for example, there is exp2(X) * exp2(X) where exp2(X) is the same instruction, so it can be improved a little. I'm going to update the patch

I recommend that you make that use-check logic into a helper function as a preliminary step, so it can be be accessed from other places. For example, we miss this transform too because the use check is too restrictive:
https://alive2.llvm.org/ce/z/27ryac

Are you talking about all checks that we do here? (see below)

Op0->hasOneUse() || Op1->hasOneUse() || (Op1 == Op0 && Op1->hasNUses(2))

Because I think this condition is going to be the same for many patterns and I will create another patch to implement the helper function and refactor conditions at least for those transformations (below) before updating this one

  1. exp(X) * exp(Y) -> exp(X + Y)
  2. exp2(X) * exp2(Y) -> exp2(X + Y)
  3. sqrt(X) * sqrt(Y) -> sqrt(X * Y)

There are more cases when this sort of check is useful like in InstCombinerImpl::SimplifyAssociativeOrCommutative

// Transform: "(A op C1) op (B op C2)" ==> "(A op B) op (C1 op C2)"
// if C1 and C2 are constants.
Value *A, *B;
Constant *C1, *C2;
if (Op0 && Op1 &&
    Op0->getOpcode() == Opcode && Op1->getOpcode() == Opcode &&
    match(Op0, m_OneUse(m_BinOp(m_Value(A), m_Constant(C1)))) &&
    match(Op1, m_OneUse(m_BinOp(m_Value(B), m_Constant(C2))))) {

The only difference is how op0 and op1 are compared (via opcode), but the use-check can be relaxed too as I understand

I have noticed that there are two similiar transformations below exp2(X) * exp2(Y) -> exp2(X + Y) and exp(X) * exp(Y) -> exp(X + Y) that are done when each operand of multiplication has exactly one use which is n't true if, for example, there is exp2(X) * exp2(X) where exp2(X) is the same instruction, so it can be improved a little. I'm going to update the patch

I recommend that you make that use-check logic into a helper function as a preliminary step, so it can be be accessed from other places. For example, we miss this transform too because the use check is too restrictive:
https://alive2.llvm.org/ce/z/27ryac

Are you talking about all checks that we do here? (see below)

Op0->hasOneUse() || Op1->hasOneUse() || (Op1 == Op0 && Op1->hasNUses(2))

Yes - let's give that line a name (not sure what's best - "hasRemovableUser" ?), so we don't have to repeat it.

Because I think this condition is going to be the same for many patterns and I will create another patch to implement the helper function and refactor conditions at least for those transformations (below) before updating this one

  1. exp(X) * exp(Y) -> exp(X + Y)
  2. exp2(X) * exp2(Y) -> exp2(X + Y)
  3. sqrt(X) * sqrt(Y) -> sqrt(X * Y)

Yes - please make a different patch. It's probably only mul/fmul that have a need for this. Most other binops should simplify if both operands are the same.
For example, the sqrt pattern should get folded by instsimplify, so I don't think that needs any changes.

I created a new patch https://reviews.llvm.org/D102698. I will update the current patch when we close the new one.

vdsered updated this revision to Diff 349008.EditedJun 1 2021, 10:20 AM
vdsered edited the summary of this revision. (Show Details)
  1. Replaced hasOneUse + equality of operands with isOnlyUserOfAnyOperand
  2. FMF are added to produced fadd
  3. Added negative tests
vdsered added a comment.EditedJun 1 2021, 10:27 AM

@spatel you mentioned one more optimization with sext that we miss because of too strict use-constraint here. I'll create another patch for that. I think, it's going to be the last one. I don't know any other transformations for mul/fmul that would benefit from the new method.

spatel added a comment.EditedJun 4 2021, 7:18 AM

@spatel you mentioned one more optimization with sext that we miss because of too strict use-constraint here. I'll create another patch for that. I think, it's going to be the last one. I don't know any other transformations for mul/fmul that would benefit from the new method.

Yes, that was:
https://alive2.llvm.org/ce/z/27ryac
...and it would be another patch

I pushed the baseline tests for this patch with:
f03f4944cf

Please update/rebase here in Phab, so we'll see just the diffs from this patch.

vdsered updated this revision to Diff 350047.Jun 5 2021, 2:51 AM
xbolva00 added inline comments.
llvm/test/Transforms/InstCombine/fmul-pow.ll
101

Restore

spatel added inline comments.Jun 6 2021, 4:29 PM
llvm/test/Transforms/InstCombine/fmul-pow.ll
89–100

Do you know why you are getting this (and similar) test changes? They are cosmetic (value names differ), but I don't see this when testing locally.

vdsered updated this revision to Diff 350162.Jun 6 2021, 6:44 PM

Removed name diff in tests

vdsered added inline comments.Jun 6 2021, 6:48 PM
llvm/test/Transforms/InstCombine/fmul-pow.ll
89–100

I have seen recently how update_test_checks warned about using tmps in tests so I thought of renaming. It doesn't say anything for these tests however... Updated the patch to remove the naming diff and see the actual changes in output for tests

spatel accepted this revision.Jun 7 2021, 3:29 AM

LGTM

llvm/test/Transforms/InstCombine/fmul-pow.ll
89–100

Ah - yes, the script will warn if it sees IR like:

%tmp1 = fmul float %x, %y

...because the script is creating FileCheck variables for unnamed values in IR (%1) with names like TMP1 (as we see in the examples in this patch). I don't see any potential conflicts with these tests.

This revision is now accepted and ready to land.Jun 7 2021, 3:29 AM
This comment was removed by vdsered.

Sorry, I didn't see you have already committed :)