If additional static type information can be deduced from an insert_slice's size operands, insert an explicit cast of the op's source operand.
This enables other canonicalization patterns that match tensor_cast ops, such as ForOpTensorCastFolder in SCF.
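
As a rough illustration of the deduction step (not the actual pattern code), the sketch below refines a source shape from constant sizes. The names refineSourceShape and kDynamicDim are hypothetical stand-ins, shapes are modeled as plain integer vectors rather than MLIR types, and it assumes the non-rank-reducing case with exactly one size per source dimension.

```cpp
#include <cstddef>
#include <cstdint>
#include <optional>
#include <vector>

// Hypothetical marker for a dynamic dimension; not MLIR's actual constant.
constexpr int64_t kDynamicDim = -1;

// Returns a refined shape if at least one dynamic source dimension can be
// replaced by a constant size, or std::nullopt if nothing new is learned.
// An empty optional in `sizes` models a size that is not a constant.
std::optional<std::vector<int64_t>>
refineSourceShape(const std::vector<int64_t> &sourceShape,
                  const std::vector<std::optional<int64_t>> &sizes) {
  // Assumption: non-rank-reducing case only, one size per source dimension.
  // Bail out instead of indexing past the end of either list.
  if (sizes.size() != sourceShape.size())
    return std::nullopt;

  std::vector<int64_t> refined = sourceShape;
  bool changed = false;
  for (std::size_t i = 0; i < sizes.size(); ++i) {
    if (refined[i] == kDynamicDim && sizes[i].has_value()) {
      refined[i] = *sizes[i];
      changed = true;
    }
  }
  if (!changed)
    return std::nullopt;
  return refined;
}
```

When a refined shape comes back, the pattern would cast the source to that more static type; when std::nullopt comes back, there is nothing to do.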
Note: every OpWithOffsetSizesAndStrides may specify only a leading subset of
its offsets/sizes/strides (the remaining ones are auto-completed to the canonical expected values); as a consequence, code that indexes these lists by the full rank may overflow.
Sending a bugfix for this, but FYI there may be other occurrences.
One easy way to track them down could be to force the verifier to require that everything is specified and then run the tests (this will also flag rank-reducing cases as false positives, but it should still help).
@springerm if you get to it first.
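
For reference, the leading-subset completion mentioned above could look roughly like the sketch below. completeLeading is a hypothetical helper, not an existing MLIR function. It assumes the caller passes the canonical defaults for each dimension (as I understand it: 0 for offsets, 1 for strides, and the full dimension for sizes).

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper: expands a leading subset of values to full rank.
// `defaults` has one entry per dimension and supplies the canonical value
// for every trailing dimension that `leading` does not cover.
std::vector<int64_t> completeLeading(const std::vector<int64_t> &leading,
                                     const std::vector<int64_t> &defaults) {
  // More values than dimensions would be a malformed op.
  assert(leading.size() <= defaults.size() && "too many leading values");

  std::vector<int64_t> full = defaults;
  for (std::size_t i = 0; i < leading.size(); ++i)
    full[i] = leading[i];
  return full;
}
```

Code that loops up to the full rank would index the completed list rather than the raw operand list, which avoids reading past the end when only a leading subset was given.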