This makes it possible to express more complex parallel loops in the affine
framework, for example, when tiling by sizes that do not evenly divide loop
trip counts, or for inner wavefront parallelism, among others. One can't use
affine.max/min and supply the results as nested loop bounds since the results
of such affine.max/min operations aren't valid symbols. Making them valid
symbols isn't an option since that would introduce selection trees into memref
subscript arithmetic as an unintended and undesired consequence. Also add
support for converting such loops to SCF. Drop some API that isn't used in the
core repo from AffineParallelOp since its semantics becomes ambiguous in the
presence of max/min bounds. Loop normalization is currently unavailable for
such loops.
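
As a rough illustration of the tiling case (a plain-Python model of the loop
structure, not MLIR and not code from this patch), the inner loop of a tiled
nest needs an upper bound of the form min(ii + tile, n) when the tile size
does not divide the trip count. The function name and the sizes used here are
hypothetical:

```python
def tiled_points(n, tile):
    """Enumerate 0..n-1 with a two-level tiled loop nest.

    Plain-Python model of the loop structure only; the name and the sizes
    are illustrative assumptions, not taken from the patch.
    """
    points = []
    # Outer loop steps over tile origins.
    for ii in range(0, n, tile):
        # The inner upper bound is a min of two expressions so that the
        # last, partial tile stops at n instead of running past it.
        for i in range(ii, min(ii + tile, n)):
            points.append(i)
    return points


# 10 iterations tiled by 4: the tiles cover [0, 4), [4, 8) and [8, 10).
assert tiled_points(10, 4) == list(range(10))
```

Without the min, the last tile in this example would also visit indices 10
and 11, which lie outside the original iteration space.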

Depends On D101171