This is an archive of the discontinued LLVM Phabricator instance.

[X86] In combineLoopSADPattern, pad result with zeros and use full size add instead of using a smaller add and inserting.
Closed, Public

Authored by craig.topper on Sep 4 2017, 4:45 PM.

Details

Summary

In some cases the result of the psadbw is narrower than the type of the add that started the match. Currently, in these cases, we use a narrower add and insert its result into the wider vector.

If we instead concatenate the psadbw result with zeros and use the full-size add, we can take advantage of the implicit zeroing we get if we emit a narrower move before the add.
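
As a rough illustration of why the two lowerings compute the same value, here is a minimal scalar sketch in plain C++. It is not the code from this patch: the lane counts (a 16 x i32 accumulator and a 128-bit psadbw result viewed as 4 x i32) and all function names are hypothetical, chosen only to model the shapes involved.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>
#include <cstdlib>

// Hypothetical shapes: a 16 x i32 accumulator and a 128-bit psadbw
// result viewed as 4 x i32. The real types depend on the matched loop.
constexpr std::size_t WideLanes = 16;
constexpr std::size_t NarrowLanes = 4;

using Wide = std::array<std::uint32_t, WideLanes>;
using Narrow = std::array<std::uint32_t, NarrowLanes>;
using Bytes = std::array<std::uint8_t, 16>;

// Scalar model of a 128-bit psadbw. Each 64-bit half of the result holds
// the sum of absolute differences of 8 byte pairs, zero-extended. Viewed
// as 4 x i32, lanes 0 and 2 carry the sums and lanes 1 and 3 are zero.
Narrow modelPsadbw(const Bytes &A, const Bytes &B) {
  Narrow R{};
  for (std::size_t I = 0; I < 16; ++I)
    R[(I / 8) * 2] +=
        static_cast<std::uint32_t>(std::abs(int(A[I]) - int(B[I])));
  return R;
}

// Model of the old lowering: add only the low lanes, then insert the
// narrow result back into the wide vector.
Wide narrowAddAndInsert(const Wide &Acc, const Narrow &Sad) {
  Wide R = Acc;
  for (std::size_t I = 0; I < NarrowLanes; ++I)
    R[I] = Acc[I] + Sad[I];
  return R;
}

// Model of the new lowering: pad the psadbw result with zeros up to the
// full width, then do a single full-width add.
Wide padAndWideAdd(const Wide &Acc, const Narrow &Sad) {
  Wide Padded{}; // upper lanes stay zero
  for (std::size_t I = 0; I < NarrowLanes; ++I)
    Padded[I] = Sad[I];
  Wide R{};
  for (std::size_t I = 0; I < WideLanes; ++I)
    R[I] = Acc[I] + Padded[I];
  return R;
}
```

Because the padded lanes are zero, the full-width add leaves the upper lanes of the accumulator unchanged, so both functions return the same vector for any input. The benefit described in the summary comes from code generation rather than from the arithmetic: the zero padding can be produced by a move that implicitly zeroes the upper bits.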

In a future patch, I want to make isel aware that the psadbw itself already zeroed the upper bits and remove the move entirely.
