This patch lowers the SAD intrinsics to native LLVM IR. It comes with a companion LLVM patch (D45723).
Line 8427 (on Diff #142772):
This clear isn't needed.
Line 8432 (on Diff #142772):
You shouldn't need to explicitly create an ArrayRef here. It should convert automatically. If it doesn't, makeArrayRef is what you should use; it will automatically infer the uint32_t from the vector.
Size the ShuffleMask to N when it's created. Then you can assign each array entry directly in the loops. This removes the need for the clear() in the later loop. It also removes the hidden code that checks whether we need to grow on every call to push_back.
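To illustrate the pre-sizing suggestion, here is a rough sketch in plain C++. It is not the code from the patch: std::vector stands in for whatever container ShuffleMask actually is, and buildMaskPreSized, refillMask, and the Offset-based mask pattern are hypothetical names and values chosen only for the example.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper: builds a mask selecting lanes Offset, Offset+1, ...
// The real mask values used by the patch are not shown in this review.
std::vector<uint32_t> buildMaskPreSized(std::size_t N, uint32_t Offset) {
  // Sized once at creation, so the loop never triggers a growth check.
  std::vector<uint32_t> ShuffleMask(N);
  for (std::size_t I = 0; I != N; ++I)
    ShuffleMask[I] = Offset + static_cast<uint32_t>(I);
  return ShuffleMask;
}

// Reuses the same buffer for a second mask. Every entry is overwritten,
// so no clear() is needed between the two loops.
void refillMask(std::vector<uint32_t> &ShuffleMask, uint32_t Offset) {
  for (std::size_t I = 0, E = ShuffleMask.size(); I != E; ++I)
    ShuffleMask[I] = Offset + static_cast<uint32_t>(I);
}
```

With direct assignment the size stays fixed at N across both loops, which is why the clear() becomes unnecessary.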
You can just pass AD as both shuffle operands. You don't need to create an Undef value. It will get optimized later.
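Why passing AD twice is harmless can be seen with a toy model of shufflevector semantics. This is a sketch in plain C++, not LLVM API code: the shuffle function below is a hypothetical stand-in, and it assumes both inputs have the same length N and that every mask index is below 2N. The point it shows holds under the assumption that the mask only selects lanes from the first operand.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Toy model of a two-operand vector shuffle: a mask index below N picks a
// lane from A, an index in [N, 2N) picks lane (index - N) from B.
// Assumes A.size() == B.size() and all mask indices are below 2N.
std::vector<int> shuffle(const std::vector<int> &A, const std::vector<int> &B,
                         const std::vector<uint32_t> &Mask) {
  const std::size_t N = A.size();
  std::vector<int> Result(Mask.size());
  for (std::size_t I = 0; I != Mask.size(); ++I)
    Result[I] = Mask[I] < N ? A[Mask[I]] : B[Mask[I] - N];
  return Result;
}
```

If every mask index is below N, the second operand is never read, so the result is the same whether it is AD again or any other value.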