Add vectorization support for linalg.generic ops that represent element-wise operations. These are converted to transfer_read + vector ops + transfer_write.
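As a rough illustration of the read / compute / write shape of this rewrite, here is a minimal conceptual sketch in Python. It is not MLIR and not the code in this change: all function names below are hypothetical stand-ins, and plain lists stand in for memrefs and vectors.

```python
from typing import Callable, List

Scalar = float
Body = Callable[[Scalar, Scalar], Scalar]


def scalar_elementwise(a: List[Scalar], b: List[Scalar],
                       out: List[Scalar], body: Body) -> None:
    """Models the original element-wise op: the body runs once per element."""
    for i in range(len(out)):
        out[i] = body(a[i], b[i])


def transfer_read(buf: List[Scalar]) -> List[Scalar]:
    """Hypothetical stand-in for a transfer_read: load the whole buffer as one value."""
    return list(buf)


def vector_op(body: Body, va: List[Scalar], vb: List[Scalar]) -> List[Scalar]:
    """Hypothetical stand-in for a vector op: the body applied to whole vectors."""
    return [body(x, y) for x, y in zip(va, vb)]


def transfer_write(vec: List[Scalar], buf: List[Scalar]) -> None:
    """Hypothetical stand-in for a transfer_write: store the whole vector back."""
    buf[:] = vec


def vectorized_elementwise(a: List[Scalar], b: List[Scalar],
                           out: List[Scalar], body: Body) -> None:
    """Models the rewritten form: read operands, compute on vectors, write result."""
    va = transfer_read(a)
    vb = transfer_read(b)
    result = vector_op(body, va, vb)
    transfer_write(result, out)
```

Both forms should leave the same values in the output buffer; the only difference in this model is that the second operates on whole buffers at once instead of one element per iteration.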
Also reorganize the vectorization tests so that they are grouped together.
Implementation derived from @burmako's work.
I thought we already had an LLVM interface for element-wise ops (ElementsMappable something, from @silvas' work?)