Current implementation of SCEVExpander demonstrates a very naive behavior when
it deals with power calculation. For example, a SCEV for x^8 looks like
(x * x * x * x * x * x * x * x)
If we try to expand it, it generates a very straightforward sequence of muls, like:
x2 = mul x, x x3 = mul x2, x x4 = mul x3, x ... x8 = mul x7, x
This is a non-efficient way of doing that. A better way is to generate a sequence of
binary power calculation. In this case the expanded calculation will look like:
x2 = mul x, x x4 = mul x2, x2 x8 = mul x4, x4
In some cases the code size reduction for such SCEVs is dramatic. If we had a loop:
x = a; for (int i = 0; i < 3; i++) x = x * x;
And this loop have been fully unrolled, we have something like:
x = a; x2 = x * x; x4 = x2 * x2; x8 = x4 * x4;
The SCEV for x8 is the same as in example above, and if we for some reason
want to expand it, we will generate naively 7 multiplications instead of 3.
The BinPow expansion algorithm here allows to keep code size reasonable.
This patch teaches SCEV Expander to generate a sequence of BinPow multiplications
if we have repeating arguments in SCEVMulExpressions.
How challenging would it be to enhance SCEVMulExpr to represent repeated multiplies by the same value as a pair (value, exponent) rather than as a list of the same value being repeated 'exponent' times?
This patch has addressed not having to expand a ^ (2 ^ N) at runtime, but we still have an exponential number of terms in the OpsAndLoops list, which should quite negatively affect compile time.