This patch demonstrates an optimization LLVM is currently missing. It is
related to elimination of checks for which we know exact symbolic exit
count via unrolling. Imagine the loop where we have an exit with known
exit count:
do { ... if (cond()) break; // if iteration № EC is reached, we must break here. ... } while (true);
Here is what happens if we unroll it with a factor of UF
do { ... if (cond()) break; // Corresponds to iterations №№ N * UF + 0 in the initial loop. ... if (cond()) break; // Corresponds to iterations №№ N * UF + 1 in the initial loop. ... if (cond()) break; // Corresponds to iterations №№ N * UF + 2 in the initial loop. <...> if (cond()) break; // Corresponds to iterations №№ N * UF + UF - 1 in the initial loop. ... } while (true);
Important fact: let Rem be EC urem UF. Then, if we know its value (for example, Rem = 2),
then we know that only the exit in the unrolled loop which corresponds to N * UF + 2 may
possibly exit, and no other exit corresponds to the exiting iteration of the original loop, so
they never exit the loop. So we can remove UF - 1 checks leaving only one of them. So overall,
we save up to EC / UF * (UF - 1) checks over all loop iterations.
The problem is that we do not actually know value of Rem statically in most cases. But we always
know it in runtime if we know the exit count and selected the UF. We can insert a preloop which
will make sure that we enter the unrolled version of the loop with the iteration number corresponding
to Rem (of the original loop). So we always know that only the first cloned check in the unrolled
loop needs to be executed. The resulting solution would be:
for (i = 0; i < Rem; i++) { ... if (cond()) goto exit; ... } do { ... if (cond()) break; // Corresponds to iterations №№ N * UF + Rem + 0 in the initial loop. ... if (false) break; // Corresponds to iterations №№ N * UF + Rem + 1 in the initial loop and may never happen. ... if (false) break; // Corresponds to iterations №№ N * UF + Rem + 2 in the initial loop and may never happen. <...> if (false) break; // Corresponds to iterations №№ N * UF + Rem + UF - 1 in the initial loop and may never happen. ... } while (true); exit: ...
This patch adds implementation for this pass for experimental use.
clang-format: please reformat the code