InstCombine is a worklist-driven algorithm, which works roughly as follows:
- All instructions are initially pushed to the worklist. The initial order is (roughly) in program order / RPO.
- All newly inserted instructions get added to the worklist.
- When an instruction is folded, its users get added back to the worklist.
- When the use-count of an instruction decreases, it gets added back to the worklist.
- ...plus a bunch of other heuristics on when we should revisit instructions.
On top of the worklist algorithm, InstCombine layers an additional fix-point iteration: If any fold was performed in the previous iteration, then InstCombine will re-populate the worklist from scratch and fold the entire function again. This continues until a fix-point is reached.
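The worklist algorithm and the outer fixpoint loop described above can be sketched on a toy model. This is only an illustration, not InstCombine's actual code: the `Node` type, `runOneIteration` and `runToFixpoint` are hypothetical names, and the only "fold" is constant-folding an add of two constants.

```cpp
#include <cassert>
#include <deque>
#include <vector>

// Hypothetical toy IR (not LLVM's data model): each node is either a
// constant or an add of two earlier nodes, referenced by index.
struct Node {
  bool IsConst;
  int Value;    // meaningful only when IsConst is true
  int LHS, RHS; // operand indices, meaningful only when IsConst is false
};

// One worklist-driven pass over Nodes. Returns the number of folds performed.
int runOneIteration(std::vector<Node> &Nodes) {
  const int NumNodes = static_cast<int>(Nodes.size());

  // Record the users of each node so that a fold can re-queue them.
  std::vector<std::vector<int>> Users(Nodes.size());
  for (int I = 0; I < NumNodes; ++I) {
    if (Nodes[I].IsConst)
      continue;
    assert(Nodes[I].LHS >= 0 && Nodes[I].LHS < I && "operand must come earlier");
    assert(Nodes[I].RHS >= 0 && Nodes[I].RHS < I && "operand must come earlier");
    Users[Nodes[I].LHS].push_back(I);
    Users[Nodes[I].RHS].push_back(I);
  }

  // Initially push every node to the worklist, in program order.
  std::deque<int> Worklist;
  for (int I = 0; I < NumNodes; ++I)
    Worklist.push_back(I);

  int NumFolds = 0;
  while (!Worklist.empty()) {
    int I = Worklist.front();
    Worklist.pop_front();
    Node &N = Nodes[I];
    if (N.IsConst)
      continue;
    const Node &L = Nodes[N.LHS];
    const Node &R = Nodes[N.RHS];
    if (!L.IsConst || !R.IsConst)
      continue;
    // Fold add(const, const) into a single constant.
    N = Node{true, L.Value + R.Value, -1, -1};
    ++NumFolds;
    // When a node is folded, its users get added back to the worklist.
    for (int U : Users[I])
      Worklist.push_back(U);
  }
  return NumFolds;
}

// Outer fixpoint loop: re-populate the worklist from scratch and run again
// until an iteration performs no folds. Returns the number of iterations.
int runToFixpoint(std::vector<Node> &Nodes) {
  int Iterations = 0;
  do {
    ++Iterations;
  } while (runOneIteration(Nodes) != 0);
  return Iterations;
}
```

In this toy, an input that needs no folds takes one iteration, while an input with at least one fold takes two: the first does all the work and the second only confirms that nothing is left to do.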
In the vast majority of cases, InstCombine will reach a fixpoint within a single iteration. However, a second iteration is performed to verify that this is indeed the fixpoint. We can see this in the statistics for llvm-test-suite:
    "instcombine.NumOneIteration": 411380,
    "instcombine.NumTwoIterations": 117921,
    "instcombine.NumThreeIterations": 236,
    "instcombine.NumFourOrMoreIterations": 2,
The way to read these numbers is that in 411380 cases, InstCombine performs no folds. In 117921 cases it performs a fold and reaches the fix-point within one iteration (the second iteration verifies the fixpoint). In the remaining 238 cases, more than one iteration is needed to reach the fixpoint.
In other words, additional iterations are needed to reach a fixpoint in only 0.04% of cases. Conversely, in 22.3% of cases InstCombine performs a completely useless extra iteration to verify the fixpoint.
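For reference, the two percentages follow directly from the four counters quoted above. The small helper below redoes that arithmetic; the struct and function names are made up for this illustration and are not part of LLVM.

```cpp
#include <cassert>
#include <cstdint>

// The four InstCombine iteration counters from the llvm-test-suite run.
struct IterationStats {
  std::uint64_t One, Two, Three, FourOrMore;
};

// Total number of InstCombine runs covered by the counters.
std::uint64_t totalRuns(const IterationStats &S) {
  return S.One + S.Two + S.Three + S.FourOrMore;
}

// Percentage of runs that needed more than one iteration to reach the
// fixpoint (three or more iterations, counting the verifying one).
double pctNeedingExtraIterations(const IterationStats &S) {
  assert(totalRuns(S) != 0 && "no runs recorded");
  return 100.0 * static_cast<double>(S.Three + S.FourOrMore) /
         static_cast<double>(totalRuns(S));
}

// Percentage of runs where the second iteration only verified the fixpoint.
double pctVerifyOnly(const IterationStats &S) {
  assert(totalRuns(S) != 0 && "no runs recorded");
  return 100.0 * static_cast<double>(S.Two) /
         static_cast<double>(totalRuns(S));
}
```

With `IterationStats{411380, 117921, 236, 2}` the total is 529539 runs, of which 238 need extra iterations (a little under 0.045%) and 117921 spend their second iteration purely on verification (about 22.3%).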
This patch proposes to remove the fixpoint iteration from InstCombine and to always perform only a single iteration. This results in a major compile-time improvement:
http://llvm-compile-time-tracker.com/compare.php?from=b7e38ff22326d7bcbd01f080dc91f47be25e703e&to=40936c7e9324ce41819483f2c02f5bbcefa292a0&stat=instructions%3Au
We get a 4-5% compile-time reduction at negligible codegen impact. (These numbers include D75362, which is a non-trivial regression when taken by itself. Most of the size-text changes are also due to that patch, not this one.)
This explicitly accepts that we will not reach a fixpoint in all cases. However, this is mitigated by two factors: First, the data suggests that this happens very rarely in practice. Second, InstCombine runs many times during the optimization pipeline (8 times even without LTO), so there are many chances to recover such cases.
In order to prevent accidental optimization regressions in the future, this patch implements a default-enabled verify-fixpoint option, which makes sure that the fixpoint has indeed been reached after a single iteration. This means that tests where this is not the case need to be explicitly annotated. The actual optimization pipeline will disable this option, as failure to reach the fixpoint is expected to happen there (in rare cases, as described above).
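As a rough sketch of what such a check amounts to (under the assumption that verification means "try one more iteration and complain if it still changes something"; the patch's actual mechanism may differ), consider the generic helper below. `runSingleIteration` and its parameters are hypothetical names for this illustration.

```cpp
#include <stdexcept>

// Runs RunOnce on S exactly once and returns whether it changed anything.
// If VerifyFixpoint is set, RunOnce is additionally tried on a copy of the
// result; if that would still change something, the single iteration did
// not reach a fixpoint and an error is raised.
// RunOnce must be callable as bool(State &), returning true on change.
template <typename State, typename Fn>
bool runSingleIteration(State &S, Fn RunOnce, bool VerifyFixpoint) {
  bool Changed = RunOnce(S);
  if (VerifyFixpoint && Changed) {
    State Copy = S; // verify on a copy so the real result is left untouched
    if (RunOnce(Copy))
      throw std::runtime_error("did not reach a fixpoint after one iteration");
  }
  return Changed;
}
```

A test harness would call this with `VerifyFixpoint = true`, so that any input needing a second iteration fails loudly and has to be annotated, while a production pipeline would pass `false` and simply accept the result of the single iteration.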
Depends on D75362.
imo the InstCombinePass constructor should default to no-verify-fixpoint, but parseInstCombineOptions should by default set verify-fixpoint, since we typically call the InstCombinePass constructor from pass pipelines