This new approach is a replacement for D37195.
Motivation:
InstCombine algorithm runs with high complexity (as it keeps running till there is no change to the code), thus it requires that each piece inside the outer loop, have a very small complexity (preferable O(1)).
Problem
There are instruction patterns that costs more than O(1) to identify and modify, thus it cannot be added to the InstCombine algorithm.
Solution
AggressiveInstCombine pass, which will run separately from InstCombine pass, and can perform expression pattern optimizations, which might take each up to O(n) complexity, where n is number of instructions in the function.
I introduced in this patch:
- The AggressiveInstCombiner class is the main pass that will run all expression pattern optimizations.
- The TruncInstCombine class, first added expression pattern optimization.
TruncInstCombine currently supports only simple instructions such as: add, sub, and, or, xor and mul.
The main difference between this and the canEvaluateTruncated from InstCombine is that this one supports:
- Instructions with multi-use, as long as all are dominated by the truncated instruction.
- Truncating to different width than the original trunc instruction requires. (this is useful if we reduce the expression width, even if we are not eliminating any instruction).
Next I will add to this TruncInstCombine class the following support, each in a separate patch:
- select, shufflevector, extractelement, insertelement
- udiv, urem
- shl, lshr, ashr
- phi node (and loop handling)