Context
We are planning to introduce a series of patches that extend LV with support for *explicit* outer loop vectorization using the VPlan infrastructure, starting from trivial outer loops and then building up functionality to deal with more complex outer loops. This can be further extended to perform outer loop auto-vectorization with additional work on legality and cost modeling but, for the sake of simplicity, that aspect is outside the scope of this patch series. Further context is in the RFC: http://lists.llvm.org/pipermail/llvm-dev/2017-December/119523.html
This patch is the first installment of Patch Series #1. Patch Series #1 focuses on bringing to LV the initial functionality to detect a deliberately limited set of trivial outer loops that have been annotated for explicit vectorization, construct VPlans for them, and generate code from those VPlans. In particular, Patch Series #1 will add support for explicit vectorization of trivial outer loops where:
- inner loops are uniform, i.e., they trivially execute in lock-step across all vector elements, and
- there are no branches inside the outer loop body other than a single loop latch and a zero-trip-test per inner loop.
The following loop is an example of a supported outer loop:
  #pragma clang loop vectorize(enable) vectorize_width(4)
  for (i=0;i<N;i++){
    // no branches here
    for (j=0;j<loop_invariant_value; j++){
      // no branches here
    }
    // no branches here
  }
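To make the uniformity restriction concrete, here is a small compilable sketch contrasting the supported shape with one that falls outside it. The function names, the array layout and the loop bodies are hypothetical and chosen only for illustration; whether a particular body is handled depends on the later patches in the series.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Supported shape: the inner trip count M is invariant in the outer loop, so
// all outer iterations (the future vector lanes) run the inner loop in
// lock-step. The only branches are the loop latches and zero-trip tests.
void addRowIndexUniform(std::vector<int> &A, int N, int M) {
  assert(N >= 0 && M >= 0);
  assert(A.size() == static_cast<std::size_t>(N) * static_cast<std::size_t>(M));
#pragma clang loop vectorize(enable) vectorize_width(4)
  for (int i = 0; i < N; i++) {
    for (int j = 0; j < M; j++)
      A[i * M + j] += i;
  }
}

// Outside the described scope: the inner trip count depends on the outer
// induction variable i, so different outer iterations would leave the inner
// loop at different times (the inner loop is not uniform).
void addRowIndexTriangular(std::vector<int> &A, int N, int M) {
  assert(N >= 0 && M >= 0);
  assert(A.size() == static_cast<std::size_t>(N) * static_cast<std::size_t>(M));
  for (int i = 0; i < N; i++) {
    int Bound = i < M ? i : M; // varies with i
    for (int j = 0; j < Bound; j++)
      A[i * M + j] += i;
  }
}
```

The second function carries no vectorization pragma on purpose, since its inner loop bound varies across outer iterations.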
In order to achieve these goals, Patch Series #1 will introduce the following functionality:
- Restrictive detection of supported loop nests, including inner loop uniformity check analysis.
- Initial VPlan construction for outer loops.
- VPlan-based vector code generation for outer loops.
Patch Series #1. Sub-patch #1
This patch adds support for detecting outer loops with irreducible control flow in LV. The current detection uses SCCs and only works for innermost loops. This patch adds a utility function that works on any CFG, given its RPO traversal and its LoopInfoBase. The function is a generalization of ‘isIrreducibleCFG’ from lib/CodeGen/ShrinkWrap.cpp, which is also updated to use the new generic utility function.
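For reviewers who want the idea in isolation, the following is a minimal standalone sketch of an RPO-plus-loop-info irreducibility check. It is not the code in this patch: ToyCFG, ToyLoopInfo and the function names are hypothetical stand-ins that use plain standard containers instead of LLVM's GraphTraits and LoopInfoBase. The assumption is that every edge pointing backwards in RPO must target the header of a loop that contains the edge's source; any other backward edge indicates irreducible control flow.

```cpp
#include <cassert>
#include <vector>

// Toy CFG (hypothetical): Succs[n] lists the successors of node n.
using ToyCFG = std::vector<std::vector<int>>;

// Toy stand-in for loop info (hypothetical, not LLVM's LoopInfoBase).
struct ToyLoopInfo {
  // InnermostHeader[n]: header of the innermost natural loop containing n,
  // or -1 if n is not in any loop.
  std::vector<int> InnermostHeader;
  // ParentHeader[h]: header of the loop enclosing the loop headed by h,
  // or -1 if that loop is outermost (or h is not a header).
  std::vector<int> ParentHeader;
};

// An edge Src -> Dst is a proper backedge if Dst is the header of some loop
// that contains Src (walking from the innermost loop outwards).
bool isProperBackedge(const ToyLoopInfo &LI, int Src, int Dst) {
  for (int H = LI.InnermostHeader[Src]; H != -1; H = LI.ParentHeader[H])
    if (H == Dst)
      return true;
  return false;
}

// Returns true if some edge that goes backwards in RPO is not a proper
// backedge of a natural loop.
bool containsIrreducibleCFG(const ToyCFG &Succs, const std::vector<int> &RPO,
                            const ToyLoopInfo &LI) {
  assert(LI.InnermostHeader.size() == Succs.size());
  assert(LI.ParentHeader.size() == Succs.size());
  std::vector<bool> Visited(Succs.size(), false);
  for (int Node : RPO) {
    Visited[Node] = true;
    for (int Succ : Succs[Node]) {
      if (!Visited[Succ])
        continue; // forward edge in RPO, nothing to check
      if (!isProperBackedge(LI, Node, Succ))
        return true;
    }
  }
  return false;
}
```

In this sketch the caller supplies the RPO and the loop information, mirroring the description above where the utility takes an RPO traversal and a LoopInfoBase rather than computing them itself.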
I would appreciate your feedback.
Thanks,
Diego