This is an archive of the discontinued LLVM Phabricator instance.

[x86] Fix PR34377 by disabling cmov conversion when we relied on it performing a zext of a register.
ClosedPublic

Authored by chandlerc on Sep 5 2017, 10:45 PM.

Details

Summary

On the PR there is discussion of how to more effectively handle this,
but this patch prevents us from miscompiling code.

Diff Detail

Repository
rL LLVM

Event Timeline

chandlerc created this revision.Sep 5 2017, 10:45 PM
This revision is now accepted and ready to land.Sep 5 2017, 10:49 PM
This revision was automatically updated to reflect the committed changes.
aaboud added inline comments.
llvm/trunk/lib/Target/X86/X86CmovConversion.cpp
297

Thanks Chandler for the quick fix.
However, we can reduce the restriction to only the case where:

  1. We are compiling for 64bit
  2. The CMOV destination is 32bit.

All other cases has no issue with the CMOV behavior.

Do you agree?

chandlerc added inline comments.Sep 6 2017, 1:59 AM
llvm/trunk/lib/Target/X86/X86CmovConversion.cpp
297

Hmm. Probably, because that's where zext is present.

However, aren't those the same conditions in which SUBREG_TO_REG will be introduced? If this is an assertion of zext-ing behavior, it can only show up due to there being zext-ing behavior of cmov itself, and that seems to be the same set of restrictions you outline.

Not sure how much more time we should spend on enhanceing the dodge of a miscompile vs. the perhaps more interesting work to handle this case elegantly and effectively.

aaboud added inline comments.Sep 6 2017, 2:10 AM
llvm/trunk/lib/Target/X86/X86CmovConversion.cpp
297

However, aren't those the same conditions in which SUBREG_TO_REG will be introduced? If this is an assertion of zext-ing behavior, it can only show up due to there being zext-ing behavior of cmov itself, and that seems to be the same set of restrictions you outline.

I think you are right, I was thinking about (CMOV16rr + zextTo32), but in such case the zext will not be removed, and we will not see the SUBREG_TO_REG. So, this restriction is good enough.

Not sure how much more time we should spend on enhanceing the dodge of a miscompile vs. the perhaps more interesting work to handle this case elegantly and effectively.

Sure, I did not mean that we need to spend any more effort immediately.
The top priority is to make sure the pass has no functionality issue, which you did already.

Thinking forward, I want to be sure what more we can do.
I also prepared a patch that add the MOVrr instruction, as Dave suggested, but I need to run performance measurements before I suggest that direction.

For now, I think we are fine with this solution, at least till we see a real performance issue.