Page MenuHomePhabricator

[InstCombine] Limit FMul constant folding for fma simplifications.
AcceptedPublic

Authored by fhahn on Wed, Sep 11, 3:03 AM.

Details

Summary

As @reames pointed out post-commit, rL371518 adds additional rounding
in some cases, when doing constant folding of the multiplication.
This breaks a guarantee llvm.fma makes and must be avoided.

This patch reapplies rL371518, but splits off the simplifications not
requiring rounding from SimplifFMulInst as SimplifyFMAFMul.

Diff Detail

Event Timeline

fhahn created this revision.Wed, Sep 11, 3:03 AM
Herald added a project: Restricted Project. · View Herald TranscriptWed, Sep 11, 3:03 AM
Herald added a subscriber: hiraditya. · View Herald Transcript
lebedev.ri added inline comments.Wed, Sep 11, 3:05 AM
llvm/test/Transforms/InstCombine/fma.ll
502–511

I don't think this restriction should apply to @llvm.fmuladd, as per https://llvm.org/docs/LangRef.html#llvm-fmuladd-intrinsic

reames requested changes to this revision.Wed, Sep 11, 9:02 AM

First, please revert the previous patch until this review concludes. We have an active miscompile in tree, and that should be addressed first.

Second, I'm not a fan of this approach. It feels very fragile. For example, what if someone adds a transform which exploits rounding but doesn't involve constants? (Say, we know some bits in one of the arguments...)

I would *strongly* suggest we restructure the code such that a common utility function is used, but with two cover functions which make the appropriate guarantees about rounding for the distinct cases.

llvm/test/Transforms/InstCombine/fma.ll
502–511

I don't see why not? The rounding is specified exactly as described.

This revision now requires changes to proceed.Wed, Sep 11, 9:02 AM
lebedev.ri added inline comments.Wed, Sep 11, 9:07 AM
llvm/test/Transforms/InstCombine/fma.ll
502–511

I think my comment got garbled. I'm saying the current patch is correct in the sense that
if we are simplifying fmuladd, we don't have any rounding concerns:
As per LangRef, fmuladd can be arbitrarily expanded to fmul+fadd, unlike fma.

reames added inline comments.Wed, Sep 11, 9:12 AM
llvm/test/Transforms/InstCombine/fma.ll
502–511

From the langref: "is equivalent to the expression a * b + c, *except that rounding will not be performed* between the multiplication and addition steps if the code generator fuses the operations. "

Once fused, this implies the rounding is defined. We can consider changing the LangRef - I think we maybe should - but per the actual wording, we must do the rounding as per an fma, not the component operations.

reames added inline comments.Wed, Sep 11, 9:15 AM
llvm/test/Transforms/InstCombine/fma.ll
502–511

p.s. Looking at the implementation, particularly SelectionDAGBuilder, I think you're right about the intent. We just need to clarify the semantics in LangRef.

lebedev.ri added inline comments.Wed, Sep 11, 9:18 AM
llvm/test/Transforms/InstCombine/fma.ll
502–511

I fail to read langref any other way already:

The ‘llvm.fmuladd.*’ intrinsic functions represent multiply-add expressions that can be fused if the code generator determines that (a) the target instruction set has support for a fused operation, and (b) that the fused operation is more efficient than the equivalent, separate pair of mul and add instructions.

"can be fused if"

The expression:
%0 = call float @llvm.fmuladd.f32(%a, %b, %c)
is equivalent to the expression a * b + c, except that rounding will not be performed between the multiplication and addition steps if the code generator fuses the operations. Fusion is not guaranteed, even if the target platform supports it. If a fused multiply-add is required, the corresponding llvm.fma intrinsic function should be used instead.

"rounding will not be performed between the multiplication and addition steps if the code generator fuses the operations"
"Fusion is not guaranteed"

fhahn added a comment.Wed, Sep 11, 9:20 AM

First, please revert the previous patch until this review concludes. We have an active miscompile in tree, and that should be addressed first.

Sure, done.

Second, I'm not a fan of this approach. It feels very fragile. For example, what if someone adds a transform which exploits rounding but doesn't involve constants? (Say, we know some bits in one of the arguments...)

I would *strongly* suggest we restructure the code such that a common utility function is used, but with two cover functions which make the appropriate guarantees about rounding for the distinct cases.

Will do, thanks.

reames added inline comments.Wed, Sep 11, 9:42 AM
llvm/test/Transforms/InstCombine/fma.ll
502–511

The key bit of wording here is that the rounding is described *following the decision to fold*.

I'm not disputing that converting " a * b + c" to a fmuladd is valid per this description. I'm disputing that converting *back* to an "a * b + c" sequence is technically disallowed by the wording.

fhahn updated this revision to Diff 219737.Wed, Sep 11, 9:54 AM
fhahn edited the summary of this revision. (Show Details)

Split off simplifications not requiring rounding from SimplifFMulInst as SimplifyFMAFMul.

fhahn marked an inline comment as done.Wed, Sep 11, 9:57 AM
fhahn added inline comments.
llvm/test/Transforms/InstCombine/fma.ll
502–511

The key bit of wording here is that the rounding is described *following the decision to fold*.

My understanding is that as long as it is a fmuladd, the code generator did not make a decision about fusing yet. IIUC there are 2 ways it can decide to fuse: 1) replace it with an fma intrinsic or lower it to a FMA in the backend.

lebedev.ri added inline comments.Wed, Sep 11, 10:04 AM
llvm/test/Transforms/InstCombine/fma.ll
502–511

I would agree with that if not for the

Fusion is not guaranteed, even if the target platform supports it. If a fused multiply-add is required, the corresponding llvm.fma intrinsic function should be used instead.

to me that sounds: "use if fmulfadd if you want to request to do FMA; or fma if you want to require FMA (if unavailable compilation will fail)"

reames accepted this revision.Wed, Sep 11, 10:18 AM

LGTM, subject to incorporate all comments below before landing.

llvm/include/llvm/Analysis/InstructionSimplify.h
146

Suggested tweak: In contrast to SimplifyFMulInst, this function will not perform simplifications whose unrounded results differ when rounded to the argument type.

llvm/lib/Analysis/InstructionSimplify.cpp
4536

Please don't repeat the doc comment. Just in the header is fine.

4545

These two appear to be new transforms. Please separate the basic optimization which is mostly refactoring plus hooks (this review) and a following optimization change.

llvm/test/Transforms/InstCombine/fma.ll
502–511

Ok, let me rephrase how I'm approaching this.

As a reviewer, I have shared a concern about a potential ambiguity in the LangRef which effects the legality of the discussed transform. Whether there is an agreement that the ambiguity exists or not, I expect a change made to clarify the semantics to remove the potential ambiguity. Please consider this a hard requirement before any change to exploit the discussed semantics as proposed in the comment that started this sub-thread.

(This entire sub-thread is off topic for the actual review.)

This revision is now accepted and ready to land.Wed, Sep 11, 10:18 AM
fhahn marked an inline comment as done.Thu, Sep 12, 12:35 PM
fhahn added inline comments.
llvm/test/Transforms/InstCombine/fma.ll
502–511

Sounds good, I'll think about how to make it more explicit.

fhahn marked 2 inline comments as done.Fri, Sep 13, 7:37 AM
fhahn added inline comments.
llvm/lib/Analysis/InstructionSimplify.cpp
4545

Done, split off as D67553

llvm/test/Transforms/InstCombine/fma.ll
502–511

Split off as D67552