[MLIR][OpenMP] Fix for nested parallel regions
This fix contains the following changes:
- Don't set the insertion point in the body callback.
- Save the continuation insertion point (IP) on a stack and, at the terminator, branch to the saved continuation IP.
Note: This fix is required to support the master operation (https://reviews.llvm.org/D87247).
Nit: SmallVector is re-exported into the mlir namespace, so there is no need to prefix it with the llvm namespace.
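
Illustration: the continuation-IP stack can be pictured with the toy model below. This is only a sketch and not the code in this patch. ToyBuilder, RegionTranslator, and the block names are hypothetical stand-ins, no LLVM or MLIR API is used, and popping the stack at the terminator is an assumption of the sketch.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Toy stand-in for an IR builder: records emitted "instructions" as strings.
// Hypothetical; it does not mirror llvm::IRBuilder.
struct ToyBuilder {
  std::vector<std::string> emitted;
  void branchTo(const std::string &block) { emitted.push_back("br " + block); }
};

// Hypothetical translator state: one saved continuation per open region.
struct RegionTranslator {
  ToyBuilder builder;
  std::vector<std::string> continuationStack;

  // On entering a region, remember where control resumes after it.
  // The insertion point itself is left untouched here.
  void enterRegion(const std::string &continuationBlock) {
    continuationStack.push_back(continuationBlock);
  }

  // At the region terminator, branch to the innermost saved continuation.
  // Assumption of this sketch: the entry is popped at the same time.
  void emitTerminator() {
    assert(!continuationStack.empty() && "terminator outside of a region");
    builder.branchTo(continuationStack.back());
    continuationStack.pop_back();
  }
};

// Two nested regions: the inner terminator must target the inner
// continuation, and the outer terminator the outer one.
std::vector<std::string> translateNestedRegions() {
  RegionTranslator translator;
  translator.enterRegion("outer.cont");
  translator.enterRegion("inner.cont");
  translator.emitTerminator(); // closes the inner region
  translator.emitTerminator(); // closes the outer region
  return translator.builder.emitted;
}
```

With a single saved continuation instead of a stack, the inner region would overwrite the outer region's continuation, and the outer terminator would branch to the wrong block.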