When the ordered clause is specified with parameter n, the n outermost loops
form a doacross loop nest. Add applyDoacrossLoop to the OpenMP IRBuilder to
emit the doacross loop "init" and "fini" runtime calls. Add one virtual
clause to the WsLoop MLIR Op to store the doacross loop bounds info.
In addition, move the barrier runtime call to the front of the "after" basic
block, and set the insertion point at the end of the "after" basic block.
With this change, lowering to LLVM IR is supported when a dynamic schedule
is specified and the collapse value is greater than 1. Also add a test
case.
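
For readers less familiar with doacross loops, here is a minimal source-level sketch of the kind of loop nest this lowering targets. It is an illustration only and is not taken from the patch or its test case: the function name wavefront and the grid computation are hypothetical. It uses the OpenMP 4.5 spelling of the cross-iteration dependences (depend(sink: ...) and depend(source)). With ordered(2), both loops belong to the doacross loop nest, which is why the lowering needs the bounds of each dimension.

```cpp
#include <vector>

// Hypothetical example: each cell depends on its upper and left neighbour,
// so iterations can only run in parallel along anti-diagonals.
std::vector<std::vector<long>> wavefront(int rows, int cols) {
  // Row 0 and column 0 stay at 1 and act as boundary values.
  std::vector<std::vector<long>> grid(rows, std::vector<long>(cols, 1));

  // ordered(2): the two outermost loops form the doacross loop nest.
#pragma omp parallel for ordered(2)
  for (int i = 1; i < rows; ++i) {
    for (int j = 1; j < cols; ++j) {
      // Wait until iterations (i-1, j) and (i, j-1) have signalled completion.
      // Sink vectors that fall outside the iteration space are ignored.
#pragma omp ordered depend(sink : i - 1, j) depend(sink : i, j - 1)
      grid[i][j] = grid[i - 1][j] + grid[i][j - 1];
      // Signal that iteration (i, j) is done.
#pragma omp ordered depend(source)
    }
  }
  return grid;
}
```

Compiled without OpenMP support, the pragmas are ignored and the loop runs sequentially with the same result, which makes the sketch easy to check against a parallel build.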
Did you consider making doacross part of an existing call like applyDynamicWorkshareLoop? What are the reasons against it? If the concern is that a potential collapseLoop loses information about the dimensionality of the original loop nest, did you consider adding that information to CanonicalLoopInfo so that it can be preserved?