Previously some constants were not pushed to the top of the resulting
expression tree as intended by the algorithm. We can remove the logic from
simplifyFAdd and rely on SimplifyAssociativeOrCommutative to do that.
Details
Diff Detail
Event Timeline
I can't tell if this has a greater effect than shown by the test diff. That diff actually demonstrates a basic missing canonicalization in IR - no fast-math is needed to convert fsub to fadd in that example or any of these more general cases:
define float @fmul_c1(float %x, float %y) { %m = fmul float %x, 7.000000e+00 %r = fsub float %y, %m ret float %r } define float @fdiv_c0(float %x, float %y) { %m = fdiv float 7.000000e+00, %x %r = fsub float %y, %m ret float %r } define float @fdiv_c1(float %x, float %y) { %m = fdiv float %x, 7.000000e+00 %r = fsub float %y, %m ret float %r }
We do have this transform in the backend though (pushing the implicit negation op into the constant operand). For example, if you codegen this with llc for x86, you'll see 3 "addss" instructions and no "subss". Would adding those canonicalizations to IR solve your motivating case(s)?
Or is there a larger example of a change resulting from this patch?
Thanks for your comment! I don’t have a larger example that results from this patch and adding the canonicalization that you mentioned will solve my motivation case. I can implement that and submit as a separate patch if that’s applicable for you. Could you please tell me the right place to add the canonicalization to?
Getting back to the current patch – I believe that it is also worth to be merged – at least, in terms of refactoring (having code that is intended to do the same thing in two places looks weird IMHO).
I think it would start from InstCombinerImpl::visitFSub() and be very similar to the specialization that already exists as foldFNegIntoConstant. I already drafted a patch for it while looking at this, so I can clean that up and post it (the harder part is adding a pile of tests to check FMF propagation!).
Getting back to the current patch – I believe that it is also worth to be merged – at least, in terms of refactoring (having code that is intended to do the same thing in two places looks weird IMHO).
I agree. I'm still not sure exactly what happens within FAddCombine, but this patch just removes code, so LGTM.
llvm/lib/Transforms/InstCombine/InstCombineAddSub.cpp | ||
---|---|---|
540 | typo: supper -> super |
llvm/test/Transforms/Reassociate/fast-basictest.ll | ||
---|---|---|
2 | I missed that this test is under Reassociate. It's generally wrong for a regression test to run 3 different passes, but it can be a separate cleanup patch. |
OK, please let me know if you need any help with that. If I understood you correctly, you plan to submit the patch by yourself. If so, could you please mention me in revision to keep me informed? Thanks!
llvm/lib/Transforms/InstCombine/InstCombineAddSub.cpp | ||
---|---|---|
540 | Fixed | |
llvm/test/Transforms/Reassociate/fast-basictest.ll | ||
2 | Got it, will submit a separate patch to fix this |
Looking over the regression tests, I noticed that we would miss a factoring optimization if we canonicalize fsub->fadd more often. This would potentially be inverting a fold done by the reassociation pass:
/// Recursively analyze an expression to build a list of instructions that have /// negative floating-point constant operands. The caller can then transform /// the list to create positive constants for better reassociation and CSE.
...so I'm not sure if it is worth doing that transform generally.
clang-format: please reformat the code