Revert of https://github.com/llvm/llvm-project/commit/bfa32573bf2d0ab587f9a5d933ea2144a382cf3c
The vectorization introduced by the commit causes a miscompile.
This is the input LLVM IR: https://gist.github.com/cheshire/9dff84d8fbb83278736278854c746d8d
This is IR with the mentioned commit reverted: https://gist.github.com/cheshire/fb06443834735c2fa04fa0eacb288606 (produces correct results)
This is IR with ToT LLVM: https://gist.github.com/cheshire/d0b4195b1be344f52d77b98c70b5241d (produces incorrect results)
If a diff of the IR is not enough, the IR dump is runnable.
To run it, compile the standalone driver tool (a single file with no dependencies) at https://github.com/tensorflow/tensorflow/blob/master/tensorflow/compiler/xla/tools/driver.cc and link it with either the good or the bad IR version above. Then run it with a single argument: the name of a file containing the buffer assignment description (contents at https://gist.github.com/cheshire/f929aa118978f4bcad03a3da11209d27).
With the vectorization reverted, the driver produces this output:
Output: ( 1.15675, 1.04532, 1.1549, 1.04836, 0.947364, 1.04667, 1.04491, 0.944249, 1.04323, 0.94699, 0.855764, 0.945469, 1.1536, 1.04247, 1.15175, 1.0455, 0.944784, 1.04382 , 1.15675, 1.04532, 1.1549, 1.04836, 0.947364, 1.04667, 1.04491, 0.944249, 1.04323, 0.94699, 0.855764, 0.945469, 1.1536, 1.04247, 1.15175, 1.0455, 0.944784, 1.04382 )
Without the revert, the output is different:
Output: ( 1.15675, 1.04532, 1.1549, 1.04836, 0.947364, 1.04667, 1.04491, 0.944249, 1.04323, 0.94699, 0.855764, 0.945469, 1.1536, 1.04247, 1.15175, 1.0455, 0.944784, 1.04382 , 1.15675, 1.10195, 1.1549, 1.04836, 0.947364, 1.04667, 1.04491, 0.899154, 1.04323, 0.94699, 0.855764, 0.945469, 1.1536, 1.09595, 1.15175, 1.0455, 0.944784, 1.04382 )
(note the 2nd, 8th and 14th numbers in the second row: 1.10195, 0.899154 and 1.09595 instead of 1.04532, 0.944249 and 1.04247)
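To make the mismatch easier to spot, here is a minimal sketch that compares the two driver outputs quoted above element by element. It is not part of the driver or of LLVM; the helper names (`parse_output`, `differing_positions`) are made up for illustration, and the row length of 18 is an assumption inferred from the printed output, which shows two groups of 18 values.

```python
import re

# Driver output with the vectorization reverted (copied from above).
GOOD = (
    "Output: ( 1.15675, 1.04532, 1.1549, 1.04836, 0.947364, 1.04667, "
    "1.04491, 0.944249, 1.04323, 0.94699, 0.855764, 0.945469, 1.1536, "
    "1.04247, 1.15175, 1.0455, 0.944784, 1.04382 , 1.15675, 1.04532, "
    "1.1549, 1.04836, 0.947364, 1.04667, 1.04491, 0.944249, 1.04323, "
    "0.94699, 0.855764, 0.945469, 1.1536, 1.04247, 1.15175, 1.0455, "
    "0.944784, 1.04382 )"
)

# Driver output without the revert (copied from above).
BAD = (
    "Output: ( 1.15675, 1.04532, 1.1549, 1.04836, 0.947364, 1.04667, "
    "1.04491, 0.944249, 1.04323, 0.94699, 0.855764, 0.945469, 1.1536, "
    "1.04247, 1.15175, 1.0455, 0.944784, 1.04382 , 1.15675, 1.10195, "
    "1.1549, 1.04836, 0.947364, 1.04667, 1.04491, 0.899154, 1.04323, "
    "0.94699, 0.855764, 0.945469, 1.1536, 1.09595, 1.15175, 1.0455, "
    "0.944784, 1.04382 )"
)

# Assumption: each row holds 18 values, as the printed output suggests.
ROW_LEN = 18


def parse_output(line):
    """Extract the floating-point values from an 'Output: ( ... )' line."""
    return [float(tok) for tok in re.findall(r"\d+\.\d+", line)]


def differing_positions(good_line, bad_line, row_len=ROW_LEN):
    """Return 1-based (row, column, good, bad) tuples where the runs disagree."""
    good = parse_output(good_line)
    bad = parse_output(bad_line)
    if len(good) != len(bad):
        raise ValueError("outputs have different lengths")
    return [
        (i // row_len + 1, i % row_len + 1, g, b)
        for i, (g, b) in enumerate(zip(good, bad))
        if g != b
    ]


if __name__ == "__main__":
    for row, col, g, b in differing_positions(GOOD, BAD):
        print(f"row {row}, value {col}: expected {g}, got {b}")
```

On the outputs above, this reports three mismatches, all in the second row, at values 2, 8 and 14.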