Remove the requirement, that when performing accumulator elimination, all other cases must return the same dynamic constant. We can do this by initializing the accumulator with the identity value of the accumulation operation, and inserting an additional operation before any return.
Details
Diff Detail
- Repository
- rG LLVM Github Monorepo
Event Timeline
What happens if there are multiple tail calls in a function using different accumulators? Something like the following:
int f(int a) { if (!a) return 0; if (a & 1) { return f(a-1)+1; } return f(a-1)*2; }
llvm/lib/Transforms/Scalar/TailRecursionElimination.cpp | ||
---|---|---|
797 | I don't see any tests involving an accumulator and a select? |
What happens if there are multiple tail calls in a function using different accumulators?
Currently on trunk, and after this patch, we only eliminate the first accumulating call we find and ignore any potential others. I think in the future it would be pretty easy to allow multiple accumulators of the same operation (eg. mul mul). In cases like the one you posted with different accumulator operations (eg. mul add), it is a lot more tricky (and I'm not sure possible) because the operations happen in reverse order.
Currently on trunk, and after this patch, we only eliminate the first accumulating call we find and ignore any potential others.
Well, on trunk, we fail out of tail recursion elimination completely, I think. But yes, makes sense.
I'd like to see a testcase for this.
In cases like the one you posted with different accumulator operations (eg. mul add), it is a lot more tricky (and I'm not sure possible) because of the operations happen in reverse order.
In general, with multiple arbitrary accumulator operations, it's impossible, yes: you can't reassociate the operations.
gcc does actually manage to do tail call elimination on my testcase. Roughly, you can distribute the multiplication over the addition. It doesn't seem very important, though.
I added your example as a test case. It is worth noting that even at -O1 clang hoists and merges the recursive calls and then branches to the correct accumulator, which would prevent either from getting removed. Also, the multiply by 2 gets converted to a shift, which would also prevent that path from getting removed. I'm going to take a look at creating another patch after this to convert shifts back to multiplies if it would enable an elimination.
I'm mostly concerned that we know what the limitations are, and that they apply correctly. There will always be some unimplemented transforms.
LGTM. I'll merge this for you
I don't see any tests involving an accumulator and a select?