The fold (fneg (fmul x, c)) -> (fmul x, -c) duplicates the fmul if the fmul has multiple uses. If fneg is free, this is even worse, since we didn't actually gain anything by doing so. Check hasOneUse or fneg not being free before doing this fold.
Details
Diff Detail
- Repository
- rL LLVM
Event Timeline
I think the only in-tree target that uses isFNegFree is R600; should I try to come up with an R600 test case? I've never worked on that backend.
Yes, it should be effected. It should be easy to copy one of the tests for this (probably fmul.ll is simplest)
I've updated one of the R600 tests to check that it doesn't duplicate the fmul-by-constant when it has one negated use and one regular use.
I would say the hasOneUse() check makes sense even in the TLI.isFNegFree() == false case (it's not TLI.NegIsWayMoreExpensiveThanMul() after all).
I'd totally agree with you, but I felt this was less invasive and risky for the moment.
Mainly, I noticed that some other folds (e.g. FMA combining) might not have been designed with an unfolded fneg in mind, so this change made independently (without looking at other folds and verifying things are okay) might worsen code on other platforms. Since I hadn't done that sort of performance verification for anything other than a GPU with neg-mods on inputs, I'm not that confident with that aggressive a change.
... plus, I think the isNegatableForFree code path already does something like this? So I kinda assumed that this fold existed for a reason and didn't want to muck with it too much.
But like, if you want to investigate making it entirely conditional on hasOneUse, that sounds totally okay!