[MLIR] [OpenMP] Add basic OpenMP parallel operation

This adds a basic implementation of the OpenMP parallel operation,
without a custom pretty-printer or parser. The if, num_threads,
private, shared, first_private, last_private, proc_bind and default
clauses are included in this implementation.

The reduction clause is omitted for now: it is more complex and
requires analysis to determine whether its implementation can be
shared with the loop dialect. The allocate clause is also omitted.

A discussion about the design of this operation can be found here:

The current OpenMP Specification can be found here:

Co-authored-by: Kiran Chandramohan <kiran.chandramohan@arm.com>

