This is an archive of the discontinued LLVM Phabricator instance.

[X86][SSE] Lower scalar_to_vector(0) to zero vector
ClosedPublic

Authored by RKSimon on Jan 24 2017, 1:56 PM.

Details

Summary

Replaces an xor+movd/movq with an xorps, which is shorter in code size, avoids an int-fpu transfer, allows modern cores to fast-path the result during decode, and helps other combines recognise an all-zero vector.

The only reason I can think of for keeping scalar_to_vector in this case is to help recognise that the upper elts are undef, but this doesn't seem to be a problem?
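To illustrate why the fold looks safe with respect to the undef upper elements, here is a minimal sketch in standalone C++. It is a toy model, not the patch itself and not the LLVM SelectionDAG API: the names `Lane`, `V4`, `scalarToVector`, `zeroVector`, `lowerScalarToVector` and `refines` are all hypothetical, and a four-lane 32-bit vector is assumed purely for illustration.

```cpp
// Toy model only: these types and functions are hypothetical and are NOT the
// LLVM SelectionDAG API. A lane is either a known value or undef (nullopt).
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <optional>

using Lane = std::optional<std::uint32_t>;
using V4 = std::array<Lane, 4>;

// scalar_to_vector: lane 0 holds the scalar, the upper lanes are undef.
V4 scalarToVector(std::uint32_t x) {
  return V4{Lane{x}, std::nullopt, std::nullopt, std::nullopt};
}

// An all-zero vector, modelling the result of xor-ing a register with itself.
V4 zeroVector() { return V4{Lane{0u}, Lane{0u}, Lane{0u}, Lane{0u}}; }

// The fold in toy form: a zero scalar becomes the zero vector; any other
// scalar keeps the generic scalar_to_vector form.
V4 lowerScalarToVector(std::uint32_t x) {
  return x == 0 ? zeroVector() : scalarToVector(x);
}

// True if `replacement` agrees with `original` on every lane that `original`
// defines. Lanes that are undef in `original` may take any value.
bool refines(const V4 &original, const V4 &replacement) {
  for (std::size_t i = 0; i < original.size(); ++i)
    if (original[i] && original[i] != replacement[i])
      return false;
  return true;
}

// Sanity check: the zero vector is a valid refinement of scalar_to_vector(0).
void selfCheck() {
  assert(refines(scalarToVector(0), lowerScalarToVector(0)));
}
```

In this model the zero vector only pins the undef upper lanes to a concrete value, so nothing that relied on lane 0 changes. What is lost is the information that the upper lanes were free to be anything, which is the concern raised above.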

Diff Detail

Repository
rL LLVM

Event Timeline

RKSimon created this revision. Jan 24 2017, 1:56 PM
RKSimon updated this revision to Diff 85757. Jan 25 2017, 8:04 AM

Added v4i32 handling as well - we were treating v4i32 scalar_to_vector as legal, so we were missing some cases.

andreadb accepted this revision. Jan 27 2017, 3:51 AM

Thanks Simon,
Looks good to me.

This revision is now accepted and ready to land. Jan 27 2017, 3:51 AM
This revision was automatically updated to reflect the committed changes.