fix for: https://bugs.llvm.org/show_bug.cgi?id=44733
The function __kmp_determine_reduction_method uses architecture-specific heuristics to choose the reduction algorithm. On 32-bit targets, a reduction needs at least 3 variables to avoid the atomic reduction method.
This patch adds reduction variables to the testcase.
see: llvm-project/openmp/runtime/src/kmp_runtime.cpp:8143
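A minimal sketch of the kind of loop this implies, assuming a plain sum reduction. The helper name reduce_three and the loop body are hypothetical and not taken from the actual testcase. It only shows a reduction clause with three variables, which per the description above is the minimum needed on 32-bit targets to avoid the atomic method:

```c
/* Hypothetical helper, not part of the real testcase: accumulates into
 * three independent reduction variables so the reduction clause carries
 * three variables. Without OpenMP enabled the pragma is ignored and the
 * loop runs serially with the same result. */
static void reduce_three(int n, int *out_a, int *out_b, int *out_c) {
  int a = 0, b = 0, c = 0;
  int i;
#pragma omp parallel for reduction(+ : a, b, c)
  for (i = 0; i < n; ++i) {
    a += i;     /* sum of 0..n-1 */
    b += 2 * i; /* twice that sum */
    c += 3 * i; /* three times that sum */
  }
  *out_a = a;
  *out_b = b;
  *out_c = c;
}
```

Calling reduce_three(100, &a, &b, &c) yields 4950, 9900 and 14850 regardless of the thread count, so the check itself does not depend on which reduction method the runtime selects.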
Do we need %sort-threads? It would be good to have either both sorted or neither.