This patch enables folding following expressions under -ffast-math flag:
pow(a, b) * a -> pow(a, b+1)
(1/a) * pow(a, b) -> pow(a, b-1)
pow(a, b) * pow(c, b) -> pow(a*c, b)
pow(a, b) * pow(a, c) -> pow(a, b+c)
Motivation: https://bugs.llvm.org/show_bug.cgi?id=35595.
Details
- Reviewers
hfinkel efriedma Quolyk lebedev.ri
Diff Detail
- Repository
- rL LLVM
- Build Status
Buildable 27662 Build 27661: arc lint + arc unit
Event Timeline
I'm slightly worried about all this bunch of missing instcombines added, as InstCombine is already really really slow.
That said, this one is probably one we really want. I skimmed the code quickly and I think it's correct, but please wait for somebody else to take a look
Why is this patch WIP? The mentioned case of 'pow(a, x) * a * a * a * a -> pow(a, x+4)' is handled already by this patch.
lib/Transforms/InstCombine/InstCombineMulDivRem.cpp | ||
---|---|---|
377–380 | You mentioned that you see warnings without explicitly setting these to nullptr in Xcode. I use Xcode as an IDE too, but I don't see any warnings like that when I remove the nullptrs. If this is really a problem, then you must be getting hundreds of these warnings for existing code that does not initialize things like this? I'd prefer not to bloat the code unnecessarily. | |
385 | Need to handle commuted versions too (please add a test): define double @pow_ab_x_a_fast_commute(double %a, double %b) { %c = fdiv double 1.0, %a ; defeat complexity-based canonicalization of operands %p = call fast double @llvm.pow.f64(double %a, double %b) %mul = fmul fast double %c, %p ret double %mul } | |
386–387 | Here and below: use m_Specific instead of the trailing check for equality? | |
395–396 | Please use variable names that match the formulas in the code comments for better readability. |
pow(a, x) * a * a * a * a emits to
define double @pow_ab_x_aaaa_fast(double %a, double %x) { %1 = call fast double @llvm.pow.f64(double %a, double %x) %2 = fmul fast double %a, %a %3 = fmul fast double %2, %2 %mul4 = fmul fast double %3, %1 ret double %mul4 }
I don't see obvious ways to fold these instructions. I Would appreciate if somebody could help me with this.
This looks like it went through -reassociate first? I was checking with a straight IR translation, so we're just multiplying by 'a' over and over. This would require more complex logic, so that's a different patch (and I'm not sure where it would belong).
lib/Transforms/InstCombine/InstCombineMulDivRem.cpp | ||
---|---|---|
385 | This comment was marked 'Done', but I don't see code to account for this or the test that I suggested. |
lib/Transforms/InstCombine/InstCombineMulDivRem.cpp | ||
---|---|---|
385 | My bad, I thought fdiv_pow_ab_a test would be enough. |
Update tests. Apply patch only for pow multiplications. Pow divisions will be considered in different patch.
This patch is trying to do too many things at once. Please split only the 1st transform (pow(X, Y) * X -> pow(X, Y+1)) into its own patch, and let's continue looking at that alone as a 1st step. You should consider (and include tests for) these variations:
- Commuted operands for fmul.
- Extra uses of the pow result.
- Vector types.
- Additional FMF.
We don't necessarily need tests for every permutation of those, but there needs to be more coverage than what we see here currently.
test/Transforms/InstCombine/fmul-pow.ll | ||
---|---|---|
28–29 | This is a misleading test name. This isn't the commuted version of the previous test - the fmul does not have a common operand with the pow(). I think you want something like this: declare double @call_f64(double) define double @pow_ab_a_reassoc(double %p, double %b) { %a = call double @call_f64(double %p) ; thwart complexity-based canonicalization %pow = call double @llvm.pow.f64(double %a, double %b) %mul = fmul reassoc double %a, %pow ret double %mul } |
The folds shown here have all been added now, so this patch can be abandoned:
7736c1936a93d7
914576c1f0b669
61af2ab68142729
072b03c4714ea4
You mentioned that you see warnings without explicitly setting these to nullptr in Xcode. I use Xcode as an IDE too, but I don't see any warnings like that when I remove the nullptrs.
If this is really a problem, then you must be getting hundreds of these warnings for existing code that does not initialize things like this? I'd prefer not to bloat the code unnecessarily.