This is an archive of the discontinued LLVM Phabricator instance.

[InstCombine] fma x, y, 0 -> fmul x, y
ClosedPublic

Authored by dmgreen on Jun 29 2020, 8:53 AM.

Details

Summary

If the addend of the fma is zero, common sense would suggest that we can convert fma x, y, 0.0 to fmul x, y. This comes up with some user code that was expecting the first fma in an unrolled loop to simplify to a fmul.

Floating point often does not follow naive common sense though. Alive suggests that this should be guarded by nsz (as fadd -0.0, 0.0 = 0.0). However it also did not complete running, so I have only validated this against fadd nsz (fmul(x,y), 0) -> fmul nsz (x,y), not with fma nsz (x, y, 0.0) -> fmul nsz (x, y).

Diff Detail

Event Timeline

dmgreen created this revision.Jun 29 2020, 8:53 AM
Herald added a project: Restricted Project. · View Herald TranscriptJun 29 2020, 8:53 AM

I don't think I see why this needs an nsz check (but think it could also fold -0 in the 3rd operand with one). The fma should only be able to produce a negative 0 if the 3rd operand is also a -0

fadd(+0, -0) -> +0
fadd(-0, -0) -> -0

fmul(-0, +x) -> -0
fmul(+0, -x) -> -0
fmul(-0, -x) -> 0

fma(+x, -0, 0) -> +0
fma(-x, +0, 0) -> +0
fma(-x, -0, 0) -> +0
fma(+x, -0, -0) -> -0

I think it's because of these ones:

fma(+0, -x, 0) -> +0

fmul(+0, -x) -> -0
fadd(+0, -0) -> +0

Removing the fadd would leave it at -0. This is what alive2 comes up with:

Processing fma.txt..                          
                                              
----------------------------------------      
  %ys = fma half %x, half %y, half 0.000000   
  ret half %ys                                
=>                                            
  %ys = fmul half %x, %y                      
  ret half %ys                                
                                              
ERROR: Value mismatch for half %ys            
                                              
Example:                                      
half %x = #xf461 (-17936)                     
half %y = #x0000 (+0.0)                       
Source value: #x0000 (+0.0)                   
Target value: #x8000 (-0.0)

I think it's because of these ones:

fma(+0, -x, 0) -> +0

fmul(+0, -x) -> -0
fadd(+0, -0) -> +0

Removing the fadd would leave it at -0.

Right - we need 'nsz' to fold adding +0.0 (I think this patch is correct as-is).

We can also always fold fma(x, y, -0.0) --> fmul x, y because (this is in InstSimplify):
fadd X, -0 ==> X

But I can't get the online version of Alive2 to confirm that (time-out). Local version should verify it?
If we don't have tests for the add of -0.0 variant, it would be good to add them for the likely follow-on patch.

llvm/lib/Transforms/InstCombine/InstCombineCalls.cpp
2385–2387

Can reduce this to something like:

return BinaryOperator::CreateFMulFMF(Src0, Src1, II);
dmgreen updated this revision to Diff 274444.Jun 30 2020, 6:19 AM

Thanks for the extra info. I ran things overnight and it said this for -0.0

Processing fma.txt..

----------------------------------------
  %ys = fma half %x, half %y, half -0.000000
  ret half %ys
=>
  %ys = fmul half %x, %y
  ret half %ys

Done: 1
Transformation seems to be correct!

real    136m22.357s
user    117m17.118s
sys     0m7.495s

Which looks good. I've extended the pattern to match and added some tests.

Unfortunately still this for +0.0:

Processing fma.txt..

----------------------------------------
  %ys = fma nsz half %x, half %y, half 0.000000
  ret half %ys
=>
  %ys = fmul half %x, %y
  ret half %ys

ERROR: SMT Error: smt tactic failed to show goal to be sat/unsat memout

real    152m39.828s
user    147m53.980s
sys     0m6.067s

But I'm fairly confident it's OK given nsz and the result for -0.0.

spatel accepted this revision.Jun 30 2020, 7:36 AM

LGTM

This revision is now accepted and ready to land.Jun 30 2020, 7:36 AM
This revision was automatically updated to reflect the committed changes.