Tranpose intrinsic performs the transpose matrix operation for arrays
of rank 2. The intrinsic is lowered to a runtime call.
This is part of the upstreaming effort from the fir-dev branch in [1].
[1] https://github.com/flang-compiler/f18-llvm-project
Co-authored-by: Valentin Clement <clementval@gmail.com>
Co-authored-by: Jean Perier <jperier@nvidia.com>