This patch adds two lowering methods. For power-of-2 vector types, it performs a tree reduction using vector ops, followed by a final reduction op. For non-power-of-2 types, it generates multiple narrower reductions and combines their results with scalar ops.
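To make the two strategies concrete, here is a rough scalar model in C++. It is only a sketch, not the code in this patch: it assumes an integer add reduction, a hypothetical native reduction width of 4 lanes (`NativeLanes`), and a non-power-of-2 split into the largest power-of-2 pieces. The names `reduceNative`, `reducePow2` and `reduceNonPow2` are made up for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Assumption: the target's final reduction op handles at most this many lanes.
constexpr std::size_t NativeLanes = 4;

static bool isPowerOf2(std::size_t N) { return N != 0 && (N & (N - 1)) == 0; }

// Stand-in for the final horizontal reduction instruction.
int reduceNative(const std::vector<int> &V) {
  assert(V.size() <= NativeLanes && "too wide for the native reduction");
  int Sum = 0;
  for (int X : V)
    Sum += X;
  return Sum;
}

// Power-of-2 case: add the high half onto the low half (one lane-wise
// vector add per step) until the vector fits the native reduction, then
// issue a single final reduction.
int reducePow2(std::vector<int> V) {
  assert(isPowerOf2(V.size()) && "expected a power-of-2 lane count");
  while (V.size() > NativeLanes) {
    std::size_t Half = V.size() / 2;
    for (std::size_t I = 0; I < Half; ++I)
      V[I] += V[I + Half];
    V.resize(Half);
  }
  return reduceNative(V);
}

// Non-power-of-2 case: peel off the largest power-of-2 piece that fits,
// reduce each piece separately, and combine the results with scalar adds.
int reduceNonPow2(const std::vector<int> &V) {
  int Total = 0;
  std::size_t Offset = 0;
  while (Offset < V.size()) {
    std::size_t Remaining = V.size() - Offset;
    std::size_t Chunk = 1;
    while (Chunk * 2 <= Remaining)
      Chunk *= 2; // largest power of 2 that is <= Remaining
    std::vector<int> Piece(V.begin() + Offset, V.begin() + Offset + Chunk);
    Total += reducePow2(Piece); // scalar combine of the narrow reductions
    Offset += Chunk;
  }
  return Total;
}
```

In this model a 16-lane input takes two vector adds before the final reduction, while a 12-lane input becomes an 8-lane and a 4-lane reduction joined by one scalar add.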
The vabs test is modified because fixing some of the fallbacks results in an unintended match.
clang-format: please reformat the code