This pass reduces the logical complexity of arith ops by choosing the
narrowest supported operand bitwidth. On some targets like mobile GPUs,
narrower bitwidths also bring better runtime performance.
The first batch of rewrites handles a simple case of arith.sitofp
and arith.uitofp with zero/sign-extended inputs. In future revisions,
I plan to extend it with the following:
- Propagating sign/zero-extensions through bit-pattern-preserving ops, e.g., vector transpose, broadcast, insertions/extractions.
- Handling linalg.index using the ValueBounds interface.
- Handling more arith ops.
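
To illustrate why the sitofp/uitofp rewrite above is value-preserving, here is
a minimal standalone C++ sketch. It is not the pass implementation; the
function names are hypothetical, and i8/i32/f32 are assumed example types. It
models extsi/extui followed by the int-to-fp conversion and compares that with
converting the narrow value directly.

```cpp
#include <cassert>
#include <cstdint>

// Wide path: models arith.extsi (i8 -> i32) followed by arith.sitofp.
float sitofpViaWide(int8_t x) {
  int32_t wide = static_cast<int32_t>(x); // sign extension
  return static_cast<float>(wide);
}

// Narrow path: models arith.sitofp applied directly to the i8 value.
float sitofpNarrow(int8_t x) { return static_cast<float>(x); }

// Wide path: models arith.extui (i8 -> i32) followed by arith.uitofp.
float uitofpViaWide(uint8_t x) {
  uint32_t wide = static_cast<uint32_t>(x); // zero extension
  return static_cast<float>(wide);
}

// Narrow path: models arith.uitofp applied directly to the i8 value.
float uitofpNarrow(uint8_t x) { return static_cast<float>(x); }

// Exhaustively compares both paths over every 8-bit input.
bool narrowingPreservesResults() {
  for (int v = -128; v <= 127; ++v) {
    auto x = static_cast<int8_t>(v);
    if (sitofpViaWide(x) != sitofpNarrow(x))
      return false;
  }
  for (int v = 0; v <= 255; ++v) {
    auto x = static_cast<uint8_t>(v);
    if (uitofpViaWide(x) != uitofpNarrow(x))
      return false;
  }
  return true;
}

// Convenience wrapper that aborts if the two paths ever disagree.
void checkNarrowing() { assert(narrowingPreservesResults()); }
```

Because the extension does not change the numeric value, the conversion can
consume the narrow operand directly and the extension becomes dead.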
Also add tests for cases where we have:
to check the recursive aspect