D45316 introduced the shouldTransformMulToShiftsAddsSubs function to check whether breaking a constant multiplication down into a series of shifts, adds, and subs is efficient. Unfortunately, this function does not check the maximum number of steps on all paths of the algorithm. This patch fixes that bug.
Fix for PR41929.
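To illustrate what "checking the maximum number of steps on all paths" means, here is a simplified sketch. It is not the code from this patch: it uses uint64_t instead of APInt, and the name worthDecomposing and its MaxSteps parameter are hypothetical. The point is that the budget test sits ahead of every place a step is counted, so neither the power-of-two path nor the split path can slip past it.

```cpp
#include <cstdint>
#include <vector>

// Hypothetical stand-in for the real check. Returns true if the constant C can
// be rewritten as sums/differences of powers of two within MaxSteps steps.
bool worthDecomposing(uint64_t C, unsigned MaxSteps) {
  std::vector<uint64_t> Work{C};
  unsigned Steps = 0;

  while (!Work.empty()) {
    uint64_t Val = Work.back();
    Work.pop_back();

    // 0 and 1 need no shift, add or sub.
    if (Val == 0 || Val == 1)
      continue;

    // Single budget check, placed before any step is counted, so it covers
    // both the power-of-two path and the split path below.
    if (Steps >= MaxSteps)
      return false;
    ++Steps;

    // A power of two costs exactly one shift.
    if ((Val & (Val - 1)) == 0)
      continue;

    // Largest power of two <= Val, and the next one up (wraps to 0 at 2^64).
    uint64_t Floor = 1;
    while ((Floor << 1) != 0 && (Floor << 1) <= Val)
      Floor <<= 1;
    uint64_t Ceil = Floor << 1;

    // Split towards whichever power of two is closer (arithmetic is mod 2^64).
    if (Val - Floor <= Ceil - Val) {
      Work.push_back(Floor);
      Work.push_back(Val - Floor);
    } else {
      Work.push_back(Ceil);
      Work.push_back(Ceil - Val);
    }
  }
  return true;
}
```

In this sketch a constant such as 7 (8 - 1) fits in two steps, while an alternating bit pattern such as 0x55555555 exhausts a budget of 8 and is rejected.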
Looking at this and the original version, some things come to mind:
a) MaxSteps needs to consider the VT of the constant for the target, i.e. for a 32-bit multiplication on an N32/N64 target the maximum number of steps is incorrect. I believe the calculation of the maximum number of steps needs to distinguish the <=i32 case from the >i32 && <=i64 case for natively supported types.
I would suggest starting with a maximum number of steps equal to the number of cycles a worst-case constant materialization sequence takes, then applying a "legalization penalty" to the number of steps if the type of the operands is not natively supported.
b) This optimization can occur before type legalization. It may be worth restricting it to after type legalization, so that there is no fudging of the legalization penalty for an illegal type with some constant value (lines 771-780).
c) This optimization should account for -Os and -Oz, as it can increase code size.
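To make (a) and (c) concrete, here is a rough sketch of how the step budget could be derived. Everything in it is an assumption for illustration: the TargetInfo struct, the stepBudget name, the cost fields, and the choice to model the "legalization penalty" as dividing the budget by the number of native registers the type occupies. The real costs would have to come from the subtarget.

```cpp
// Hypothetical description of a target; the cost values are placeholders that
// a real implementation would take from the subtarget, not measured numbers.
struct TargetInfo {
  unsigned NativeBits;        // widest natively supported integer type
  unsigned MaterializeCost32; // worst-case materialization cost, <= i32
  unsigned MaterializeCost64; // worst-case materialization cost, > i32 && <= i64
};

// Returns the maximum number of shift/add/sub steps worth emitting, or 0 if
// the transform should not be attempted at all.
unsigned stepBudget(unsigned TypeBits, const TargetInfo &TI, bool OptForSize) {
  // Point (c): the expansion can grow code, so skip it under -Os / -Oz.
  if (OptForSize)
    return 0;

  // Only the two cases discussed in (a) are modelled here.
  if (TypeBits == 0 || TypeBits > 64)
    return 0;

  // Point (a): start from the worst-case constant materialization cost.
  unsigned Base =
      TypeBits <= 32 ? TI.MaterializeCost32 : TI.MaterializeCost64;

  // Assumed form of the legalization penalty: each step on a type wider than
  // the native width is taken to cost one operation per native-width part.
  unsigned Parts = (TypeBits + TI.NativeBits - 1) / TI.NativeBits;
  return Base / Parts;
}
```

With placeholder costs of 2 and 6, an i64 multiply gets a budget of 6 on a 64-bit target but only 3 on a 32-bit target. If the transform only ran after type legalization, as suggested in (b), Parts would always be 1 and the penalty term would go away.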