This is an archive of the discontinued LLVM Phabricator instance.

[x86] these aren't the undefs you're looking for (PR32176)
ClosedPublic

Authored by spatel on Mar 10 2017, 9:26 AM.

Details

Summary

x86 has undef SSE/AVX intrinsics that should represent a bogus register operand. This is not the same as LLVM's undef value which can take on multiple bit patterns.

There are better solutions / follow-ups to this discussed here:
https://bugs.llvm.org/show_bug.cgi?id=32176
...but this should prevent miscompiles with a one-line code change.

Diff Detail

Repository
rL LLVM

Event Timeline

spatel created this revision.Mar 10 2017, 9:26 AM
craig.topper edited edge metadata.Mar 10 2017, 3:42 PM

Have you ran the tests all the way through to assembly and made sure we don't regress? If we do regress, I wouldn't hold up fixing this, but we should at least have bugs for what breaks.

lib/CodeGen/CGBuiltin.cpp
7384 ↗(On Diff #91362)

Do we want to leave a TODO here to maybe do something different when we have freeze?

Have you ran the tests all the way through to assembly and made sure we don't regress? If we do regress, I wouldn't hold up fixing this, but we should at least have bugs for what breaks.

Yes - I spot-checked several of the shuffle cases, and we always catch those even at -O0. As noted in the PR, we get those in the backend or InstCombine if optimization is on.

The AVX2 gathers regress (we have an xor for the dummy register operand), and of course the undef intrinsics themselves now have an xor. If you know where those undef intrinsics are intended to be used, I can try that too. There's no problem with the pcmpeqd example in the bug report. I'll file a new PR for the gathers once this goes in and link to PR32176.

lib/CodeGen/CGBuiltin.cpp
7384 ↗(On Diff #91362)

Sure - I'll add that.

nlopes accepted this revision.Mar 11 2017, 1:26 AM
This revision is now accepted and ready to land.Mar 11 2017, 1:26 AM
This revision was automatically updated to reflect the committed changes.