Model reductions similarly to SCF ParallelOp in order to support arbitrary
parallel reductions. There are two main differences: (1) reductions are named
and declared beforehand, similarly to functions, using a special op that
provides the neutral element, the reduction code and, optionally, the atomic
reduction code; (2) reductions go through memory rather than through SSA
values, because this is closer to the OpenMP semantics.
See https://llvm.discourse.group/t/rfc-openmp-reduction-support/3367.
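
The model described above can be illustrated with a small Python sketch. It is
not the MLIR syntax of the new ops and not code from this change: the names
ReductionDecl and parallel_reduce are hypothetical, the "threads" are simulated
sequentially, and the optional atomic reduction code is left out. It only shows
a reduction declared once with a neutral element and a combiner, then applied
to an accumulator that lives in memory.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass(frozen=True)
class ReductionDecl:
    """Hypothetical stand-in for a named reduction declaration."""
    name: str
    neutral: Callable[[], int]          # produces the neutral element
    combine: Callable[[int, int], int]  # the reduction code


def parallel_reduce(decl: ReductionDecl, accumulator: List[int],
                    chunks: List[List[int]]) -> None:
    """Reduce every chunk into accumulator[0].

    The one-element list plays the role of a memory location: the result is
    written back through it instead of being returned as a value.
    """
    partials = []
    for chunk in chunks:  # each chunk models the work of one thread
        private = decl.neutral()  # private copy starts at the neutral element
        for value in chunk:
            private = decl.combine(private, value)
        partials.append(private)
    for partial in partials:  # merge the private copies into memory
        accumulator[0] = decl.combine(accumulator[0], partial)


# Declare the reduction once, then refer to it where it is used.
add_i32 = ReductionDecl("add_i32", neutral=lambda: 0,
                        combine=lambda a, b: a + b)

total = [10]  # accumulator in "memory", holding an initial value
parallel_reduce(add_i32, total, [[1, 2], [3, 4], [5]])
```

After the call, total[0] holds the initial value combined with every element.
Starting each private copy from the neutral element keeps the initial value
from being counted once per thread.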