[MLIR] Improve calling convention for unranked memory descriptor results.
The size of an unranked memory descriptor result is not statically known
because its inner descriptor has dynamic rank. Returning such descriptors from
functions generally requires dynamic memory allocation, which involves calls to
malloc and can be expensive. To circumvent this problem, we allocate buffers on
the stack that are big enough to hold inner descriptors up to some supported
rank (max-unranked-desc-buffer-rank). If the rank of the returned descriptor
does not exceed this limit, we copy the inner descriptor to the stack-allocated
buffer and avoid heap allocation entirely. Otherwise, if the rank is too big
for the pre-allocated buffer, we fall back to dynamic memory allocation. This
optimization is similar to the inline storage of llvm::SmallVector.
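
The fallback logic can be illustrated with a minimal C++ sketch. It is only an
analogy, not the code emitted by the lowering: the names kMaxInlineRank,
ResultBuffer, UnrankedResult, copyResult and releaseResult are hypothetical,
the value 4 is an arbitrary stand-in for max-unranked-desc-buffer-rank, and the
sketch assumes the buffer is provided by the caller and that the inner
descriptor is laid out as {allocated ptr, aligned ptr, offset, sizes[rank],
strides[rank]}.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstdlib>
#include <cstring>

// Hypothetical stand-in for max-unranked-desc-buffer-rank.
constexpr std::size_t kMaxInlineRank = 4;

// Bytes needed by an inner descriptor of the given rank, assuming the layout
// {allocated ptr, aligned ptr, offset, sizes[rank], strides[rank]}.
constexpr std::size_t descriptorBytes(std::size_t rank) {
  return 2 * sizeof(void *) + (1 + 2 * rank) * sizeof(std::int64_t);
}

// Pre-allocated storage, assumed here to live in the caller's stack frame.
struct ResultBuffer {
  alignas(std::max_align_t) unsigned char
      inlineStorage[descriptorBytes(kMaxInlineRank)];
};

// Simplified unranked descriptor: a rank plus a pointer to the inner one.
struct UnrankedResult {
  std::size_t rank;
  void *descriptor;
  bool onHeap; // true only if the fallback path called malloc
};

// Copies the inner descriptor into the pre-allocated buffer when the rank
// fits, and falls back to heap allocation otherwise.
UnrankedResult copyResult(const void *src, std::size_t rank,
                          ResultBuffer &buffer) {
  const std::size_t bytes = descriptorBytes(rank);
  const bool fits = rank <= kMaxInlineRank;
  void *dst = fits ? static_cast<void *>(buffer.inlineStorage)
                   : std::malloc(bytes);
  if (!dst)
    std::abort(); // heap allocation failed on the fallback path
  std::memcpy(dst, src, bytes);
  return {rank, dst, !fits};
}

// Frees the inner descriptor only if it was heap-allocated.
void releaseResult(UnrankedResult &result) {
  if (result.onHeap)
    std::free(result.descriptor);
  result.descriptor = nullptr;
}
```

In this sketch a result of rank 2 is copied into the stack buffer without any
call to malloc, while a result of rank 6 takes the heap path and must be
released by the caller.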