The vector.fma operation is portable enough across targets that we do not want
to keep it wrapped under vector.outerproduct and llvm.intrin.fmuladd.
This revision lifts the op into the vector dialect and implements the lowering to LLVM by using two patterns:
- a pattern that lowers from n-D to (n-1)-D by unrolling when n >= 2
- a pattern that converts from 1-D to the proper LLVM representation
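
To make the two-pattern scheme concrete, here is a minimal Python sketch of the idea, not the actual MLIR rewrite patterns. It models vectors as non-empty nested lists, and the names `rank`, `fma_1d`, and `fma_nd` are hypothetical. It assumes unrolling applies to any rank of 2 or more, so that every operand eventually reaches the 1-D case. Python computes `x * y + z` with two roundings, so this only illustrates the structure of the lowering, not fused rounding behavior.

```python
def rank(v):
    """Number of nesting levels of a non-empty nested list (0 for a scalar)."""
    r = 0
    while isinstance(v, list):
        r += 1
        v = v[0]
    return r


def fma_1d(a, b, c):
    """Elementwise a * b + c on flat lists; stands in for the 1-D lowering."""
    return [x * y + z for x, y, z in zip(a, b, c)]


def fma_nd(a, b, c):
    """Unroll the outermost dimension until the operands are 1-D."""
    if rank(a) <= 1:
        return fma_1d(a, b, c)
    # n-D -> (n-1)-D: one fma per slice along the leading dimension.
    return [fma_nd(x, y, z) for x, y, z in zip(a, b, c)]


if __name__ == "__main__":
    a = [[1.0, 2.0], [3.0, 4.0]]
    b = [[2.0, 2.0], [2.0, 2.0]]
    c = [[1.0, 1.0], [1.0, 1.0]]
    print(fma_nd(a, b, c))
```

Each recursive call peels off one leading dimension, mirroring how the unrolling pattern would be applied repeatedly until only 1-D ops remain for the final conversion.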
typo: that operate (plural)
More generally, can you describe the semantics in a bit more detail than this? In particular, the lowering to LLVM could be mentioned at some point as motivation for having this op, but it seems a bit strange to mention it in the very first sentence.