This patch teaches llvm-mca how to identify register writes that implicitly zero the upper portion of a super-register.
On X86-64, a general purpose register is implemented in hardware as a 64-bit register. Quoting the Intel 64 Software Developer's Manual: "an update to the lower 32 bits of a 64 bit integer register is architecturally defined to zero extend the upper 32 bits".
Also, a write to an XMM register performed by an AVX instruction implicitly zeroes the upper 128 bits of the aliasing YMM register.
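As a rough illustration of the general purpose register case quoted above, here is a minimal C++ sketch of the architectural behaviour. The helper names `writeLow32` and `writeLow16` are hypothetical and are not part of the patch; they only model the value of the 64-bit register after a sub-register write.

```cpp
#include <cstdint>

// Hypothetical model, for illustration only (not code from the patch).

// A 32-bit write (e.g. `movl $imm, %eax`) zero-extends into the 64-bit
// register, so the result does not depend on the old 64-bit value.
constexpr std::uint64_t writeLow32(std::uint64_t /*Old*/, std::uint32_t Value) {
  return static_cast<std::uint64_t>(Value);
}

// A 16-bit write (e.g. `movw $imm, %ax`) leaves the upper 48 bits untouched,
// so the result is a merge of the old value and the new low bits.
constexpr std::uint64_t writeLow16(std::uint64_t Old, std::uint16_t Value) {
  return (Old & ~std::uint64_t{0xFFFF}) | Value;
}
```

The 32-bit helper ignores its `Old` argument entirely, which is why such a write can be treated as a full redefinition of the super-register rather than a partial update.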
This patch adds a new method named clearsSuperRegisters to the MCInstrAnalysis interface to help identify instructions that implicitly clear the upper portion of a super-register.
The rest of the patch teaches llvm-mca to query that new method and to update the register dependencies accordingly.
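To make the effect on register dependencies concrete, here is a toy C++ sketch. It is an assumption-laden model, not llvm-mca's actual data structures: the `Write` struct and the `liveWriters` function are hypothetical, and the `ClearsSuperRegs` flag simply stands in for the answer that a clearsSuperRegisters-style query would give for a write.

```cpp
#include <cstdint>

// Hypothetical description of one write to a 64-bit register or to one of
// its sub-registers. Illustration only.
struct Write {
  unsigned Bits;        // width of the written (sub-)register: 8, 16, 32 or 64
  bool ClearsSuperRegs; // stand-in for a clearsSuperRegisters-style answer
};

// Returns a bitmask (bit I corresponds to Writes[I]) of the writes that a
// later read of the full 64-bit register would depend on in this toy model.
constexpr std::uint32_t liveWriters(const Write *Writes, unsigned N) {
  std::uint32_t Live = 0;
  for (unsigned I = 0; I < N; ++I) {
    // A full-width write, or a write that clears the super-register,
    // redefines the whole register, so older writers are no longer visible.
    if (Writes[I].Bits == 64 || Writes[I].ClearsSuperRegs)
      Live = 0;
    Live |= std::uint32_t{1} << I;
  }
  return Live;
}
```

In this model, a 64-bit write followed by a 16-bit write leaves both writers live, while a 64-bit write followed by a 32-bit clearing write leaves only the second one live. A tool that did not know about the implicit clear would keep the dependency on the older write.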
I compared the kernels from tests clear-super-register-1.s and clear-super-register-2.s against the output from perf on btver2.
Previously there was a large discrepancy between the estimated IPC and the measured IPC. With this patch, the differences are mostly in the noise.
Please let me know if this is okay to commit.
Thanks,
Andrea