This is an archive of the discontinued LLVM Phabricator instance.

Fix confusion over x86_64 CMOV semantics in order to avoid unnecessary zero extensions
ClosedPublic

Authored by DavidKreitzer on Jul 28 2016, 2:37 PM.

Details

Summary

We noticed this issue while working on something unrelated.

The check for X86ISD::CMOV was added long ago in this revision:

r81814 | djg | 2009-09-14 17:14:11 -0700 (Mon, 14 Sep 2009) | 3 lines

On x86-64, the 32-bit cmov doesn't actually clear the high 32-bit of
its result if the condition is false.


But that statement is incorrect. The 32-bit CMOVs do clear the high 32 bits of the result regardless of whether the condition is true or false. That is easily verifiable, and other compilers including MSVC and the Intel compiler take advantage of this semantic to avoid unnecessary 32-bit --> 64-bit zero extends.
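As an illustration only (this is not taken from the patch or its tests), the kind of source pattern that can benefit is a 32-bit select whose result is widened to 64 bits. The function name select_then_zext is hypothetical, and whether a given compiler actually emits a CMOV here depends on the target and optimization level.

```cpp
#include <cstdint>

// Hypothetical helper: pick one of two 32-bit values, then widen to 64 bits.
// If the select is lowered to a 32-bit CMOVcc, the upper 32 bits of the
// destination register are already zero, so no separate zero-extending
// move should be needed afterwards.
std::uint64_t select_then_zext(bool cond, std::uint32_t a, std::uint32_t b) {
    std::uint32_t picked = cond ? a : b;  // candidate for a 32-bit CMOVcc
    return picked;                        // implicit 32-bit -> 64-bit zero extension
}
```

Comparing the generated assembly for such a function before and after the change is one way to check whether a redundant 32-bit register-to-register move has disappeared.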

The latest architecture manuals from both Intel and AMD support this change, though I wonder if an earlier documentation bug caused the confusion. At any rate, the latest AMD manual says, "In 64-bit mode, CMOVcc with a 32-bit operand size will clear the upper 32 bits of the destination register even if the condition is false." And the latest Intel manual describes the behavior in pseudo-code as

Operation

    temp ← SRC
    IF condition TRUE
        THEN
            DEST ← temp;
        FI;
    ELSE
        IF (OperandSize = 32 and IA-32e mode active)
            THEN
                DEST[63:32] ← 0;
        FI;
    FI;
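The Intel pseudo-code above can be sketched as a small software model. This is a simplified illustration, not how LLVM or the hardware implements it: the function cmov32 and its parameters are hypothetical names, and the 64-bit destination register is modeled as a plain integer.

```cpp
#include <cstdint>

// Hypothetical model of a 32-bit CMOVcc writing a 64-bit register in
// 64-bit mode. `dest` is the register's old 64-bit value; the return
// value is its new 64-bit value.
constexpr std::uint64_t cmov32(std::uint64_t dest, std::uint32_t src,
                               bool condition) {
    // Condition true: DEST <- temp. A 32-bit register write zero-extends
    // into the full 64-bit register.
    // Condition false: low 32 bits unchanged, DEST[63:32] <- 0.
    return condition ? static_cast<std::uint64_t>(src)
                     : (dest & 0xFFFFFFFFull);
}

// In both cases the upper 32 bits of the result are zero.
static_assert(cmov32(0xDEADBEEF00000001ull, 5u, true) == 5ull, "taken");
static_assert(cmov32(0xDEADBEEF00000001ull, 5u, false) == 1ull, "not taken");
```

In this model the result never has bits set above bit 31, which is the property that makes a following 32-bit to 64-bit zero extension redundant.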

Not surprisingly, this change had no significant performance impact (measured on cpu2000 and other benchmarks).

Event Timeline

DavidKreitzer retitled this revision from (no title) to Fix confusion over x86_64 CMOV semantics in order to avoid unnecessary zero extensions.
DavidKreitzer updated this object.
DavidKreitzer added reviewers: sunfish, mkuper, aaboud.
DavidKreitzer added a subscriber: llvm-commits.
mkuper accepted this revision. Jul 28 2016, 2:45 PM
mkuper edited edge metadata.

LGTM

This revision is now accepted and ready to land. Jul 28 2016, 2:45 PM
This revision was automatically updated to reflect the committed changes.