This adds a pattern that lowers linalg.quantized_matmul to linalg.matmul.
quantized_matmul is useful as a higher-level op because it maps directly from
higher-level dialects such as TOSA.
Most practical codegen paths will want to distribute out the zero-point
subtractions and thereby reduce quantized_matmul to matmul. This commit
adds a pass and pattern doing exactly that, generating a linalg.matmul
named op. While that is not strictly necessary (generic linalg transforms
could be taught to do the same), at this point a few different groups have
found (*) that they currently depend on matmuls being named matmul ops for
various reasons, so this pattern will be useful for the time being.
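For reference, the rewrite rests on a simple algebraic identity: subtracting
the zero points inside the accumulation is equivalent to one plain matmul plus
correction terms built from row/column sums. Below is a minimal numpy sketch of
that identity (an illustration only, not the MLIR implementation; the array
sizes and zero-point values are arbitrary):

```python
import numpy as np

# quantized_matmul computes sum_k (A[i,k] - za) * (B[k,j] - zb).
# Expanding the product gives:
#   (A @ B)[i,j] - zb*rowsum(A)[i] - za*colsum(B)[j] + K*za*zb
# i.e. one ordinary matmul plus zero-point correction terms.

rng = np.random.default_rng(0)
K = 8
A = rng.integers(0, 256, size=(4, K)).astype(np.int64)
B = rng.integers(0, 256, size=(K, 5)).astype(np.int64)
za, zb = 128, 3  # arbitrary example zero points

# Direct definition of the quantized matmul.
quantized = (A - za) @ (B - zb)

# Rewritten form: a single plain matmul plus corrections.
rewritten = (A @ B
             - zb * A.sum(axis=1, keepdims=True)
             - za * B.sum(axis=0, keepdims=True)
             + K * za * zb)

assert np.array_equal(quantized, rewritten)
```

This is why the lowering only needs to materialize the row/column sums and a
constant, leaving the hot loop as an ordinary named matmul.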
(*): Tracking issue:
https://github.com/google/iree/issues/8330
Two different sets of people have run into this:
https://github.com/google/iree/issues/8149
https://github.com/google/iree/pull/8281
Independently, I also need this for other work on matmul-to-mmt4d.