This is an archive of the discontinued LLVM Phabricator instance.

[DAGCombine] Perform the fold of A - (-B) -> A + B only when it is cheaper
AbandonedPublic

Authored by steven.zhang on Feb 25 2020, 11:57 PM.

Details

Reviewers
rampitec
RKSimon
craig.topper
spatel
dmgreen
arsenm
jsji
Group Reviewers
Restricted Project
Summary

We have the rule to fold A + B -> A - (-B) only when it is cheaper.

// fold (fadd A, (fneg B)) -> (fsub A, B)
if ((!LegalOperations || TLI.isOperationLegalOrCustom(ISD::FSUB, VT)) &&
    TLI.getNegatibleCost(N1, DAG, LegalOperations, ForCodeSize) ==
        TargetLowering::NegatibleCost::Cheaper)
  return DAG.getNode(
      ISD::FSUB, DL, VT, N0,
      TLI.getNegatedExpression(N1, DAG, LegalOperations, ForCodeSize), Flags);

But for the reverse folding(A - (-B) -> A + B), it is done as long as it is not expensive, which including that it is neutral. This patch fix this transformation that didn't have any gain.

Diff Detail

Event Timeline

steven.zhang created this revision.Feb 25 2020, 11:57 PM
steven.zhang edited the summary of this revision. (Show Details)Feb 25 2020, 11:59 PM

The test changes don't immediately stand out to me as improvements.
I'd think the fix should go in other direction.

The test changes don't immediately stand out to me as improvements.
I'd think the fix should go in other direction.

Yes, no improvement as we know that it is neutral. And I don't see the improvement case also from the test change if we do this folding. Regarding to we only do the reverse folding when it is cheaper, I want to make them consistent to avoid the confusion no matter which direction. It is confusing that sometimes we do the folding if it is neutral and sometime not. The current implementation imply that, we prefer the "ADD" to "SUB". I am not sure that it is by the design as I don't see the reason for that. Welcome for any comments.

RKSimon added inline comments.Feb 26 2020, 2:47 AM
llvm/test/CodeGen/X86/dag-fmf-cse.ll
14

Annoying - any chance you can investigate?

The test changes don't immediately stand out to me as improvements.
I'd think the fix should go in other direction.

Yes, no improvement as we know that it is neutral. And I don't see the improvement case also from the test change if we do this folding.

Regarding to we only do the reverse folding when it is cheaper,
I want to make them consistent to avoid the confusion no matter which direction.

That was my suggestion, yes. Change the other fold to be consistent with this one.

It is confusing that sometimes we do the folding if it is neutral and sometime not. The current implementation imply that, we prefer the "ADD" to "SUB".

That does sound sane to me, we certainly do have such a preference at least for integers, at least in middle-end.

I am not sure that it is by the design as I don't see the reason for that. Welcome for any comments.

Loss of commutativity by going from fadd to fsub will likely cause register pressure regressions.

The test changes don't immediately stand out to me as improvements.
I'd think the fix should go in other direction.

I want to make them consistent to avoid the confusion no matter which direction.

Loss of commutativity by going from fadd to fsub will likely cause register pressure regressions.

I'm not very active wrt fp side of things, but i would almost think we need something like this instead

// unfold (fsub A, B) -> (fadd A, (fneg B))
if ((!LegalOperations || TLI.isOperationLegalOrCustom(ISD::FADD, VT)) &&
    TLI.getNegatibleCost(N1, DAG, LegalOperations, ForCodeSize) !=
        TargetLowering::NegatibleCost::Expensive)
  return DAG.getNode(
      ISD::FADD, DL, VT, N0,
      TLI.getNegatedExpression(N1, DAG, LegalOperations, ForCodeSize), Flags);
arsenm added inline comments.Feb 26 2020, 7:16 AM
llvm/test/CodeGen/AMDGPU/fmuladd.f16.ll
198

This is a regression. The negate should have been pulled out since it folds into the user for smaller code size

The test changes don't immediately stand out to me as improvements.
I'd think the fix should go in other direction.

I want to make them consistent to avoid the confusion no matter which direction.

Loss of commutativity by going from fadd to fsub will likely cause register pressure regressions.

I'm not very active wrt fp side of things, but i would almost think we need something like this instead

// unfold (fsub A, B) -> (fadd A, (fneg B))
if ((!LegalOperations || TLI.isOperationLegalOrCustom(ISD::FADD, VT)) &&
    TLI.getNegatibleCost(N1, DAG, LegalOperations, ForCodeSize) !=
        TargetLowering::NegatibleCost::Expensive)
  return DAG.getNode(
      ISD::FADD, DL, VT, N0,
      TLI.getNegatedExpression(N1, DAG, LegalOperations, ForCodeSize), Flags);

Right, and we do that already.
Then i'm not sure there is anything to fix here..

steven.zhang abandoned this revision.Feb 26 2020, 5:35 PM

Seems that we indeed see some regression and 'add' has some benefit on the register pressure over sub as it is commutable. I will abandon this revision, Thank you for all the comments.