We noticed this issue while working on something unrelated.
The check for X86ISD::CMOV was added long ago in this revision:
r81814 | djg | 2009-09-14 17:14:11 -0700 (Mon, 14 Sep 2009) | 3 lines
On x86-64, the 32-bit cmov doesn't actually clear the high 32-bit of
its result if the condition is false.
But that statement is incorrect. The 32-bit CMOVs do clear the high 32 bits of the result regardless of whether the condition is true or false. That is easily verifiable, and other compilers, including MSVC and the Intel compiler, take advantage of this behavior to avoid unnecessary 32-bit to 64-bit zero extensions.
The latest architecture manuals from both Intel and AMD support this change, though I wonder if an earlier documentation bug caused the confusion. At any rate, the latest AMD manual says, "In 64-bit mode, CMOVcc with a 32-bit operand size will clear the upper 32 bits of the destination register even if the condition is false." And the latest Intel manual describes the behavior in pseudo-code as:
Operation
    temp ← SRC
    IF condition TRUE
        THEN DEST ← temp; FI;
    ELSE
        IF (OperandSize = 32 and IA-32e mode active) THEN DEST[63:32] ← 0; FI;
    FI;
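To make the consequence of that pseudo-code concrete, here is a minimal sketch in C that models a 32-bit CMOVcc executing in 64-bit mode. It is only an illustration of the documented semantics, not code from the patch: the function name model_cmov32 and its parameters are hypothetical, and the model assumes the usual x86-64 rule that a 32-bit register write zero-extends into the full 64-bit register.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical model (not part of LLVM) of a 32-bit CMOVcc in 64-bit mode,
 * following the Intel pseudo-code quoted above.
 *
 *   dest      - full 64-bit contents of the destination register beforehand
 *   src       - 32-bit source operand
 *   condition - whether the CMOVcc condition holds
 *
 * Returns the full 64-bit contents of the destination register afterwards.
 */
static uint64_t model_cmov32(uint64_t dest, uint32_t src, bool condition)
{
    uint32_t temp = src;                 /* temp <- SRC */

    if (condition) {
        /* DEST <- temp; assumed: a 32-bit write zero-extends to 64 bits. */
        return (uint64_t)temp;
    }

    /* Condition false: low half is kept, but DEST[63:32] <- 0. */
    return dest & UINT64_C(0xFFFFFFFF);
}
```

In this model, model_cmov32(0xDEADBEEF12345678, 0xAABBCCDD, false) yields 0x0000000012345678 and the same call with a true condition yields 0x00000000AABBCCDD. Either way the upper 32 bits come out zero, which is why a zero extend following a 32-bit CMOV is redundant.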
Not surprisingly, there was no significant performance impact from this change (on cpu2000, et al).