This fix addresses the failures reported in PR27681, PR27580 and PR27804. They are caused by a bug in CriticalAntiDepBreaker.cpp that became more exposed when I recently enabled the post-RA scheduler for more X86 CPUs.
At the heart of the issue is that MCSuperRegIterator does not visit every register contained in the outermost super-register.
For example, starting from %CL it visits %CL, %CX, %ECX and %RCX, but not %CH. This isn't a direct problem,
but it leads to some subtle differences in the various register properties maintained by this optimization,
where %CH stands out as different.
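To make the gap concrete, here is a minimal sketch using a toy register table. It is not LLVM's MCRegisterInfo or MCSuperRegIterator; ParentOf, superRegsInclusive and subRegsInclusive are hypothetical names introduced only for illustration, and the table assumes just the %RCX family described above.

```cpp
#include <cassert>
#include <map>
#include <set>
#include <string>
#include <vector>

// Toy register table: each register maps to its immediate super-register.
// Illustrative only; this is not LLVM's MCRegisterInfo.
static const std::map<std::string, std::string> ParentOf = {
    {"CL", "CX"}, {"CH", "CX"}, {"CX", "ECX"}, {"ECX", "RCX"}};

// Walk upward from Reg (inclusive), mimicking a super-register walk.
std::vector<std::string> superRegsInclusive(const std::string &Reg) {
  assert(!Reg.empty() && "register name must be non-empty");
  std::vector<std::string> Out;
  Out.push_back(Reg);
  std::map<std::string, std::string>::const_iterator It = ParentOf.find(Reg);
  while (It != ParentOf.end()) {
    Out.push_back(It->second);
    It = ParentOf.find(It->second);
  }
  return Out;
}

// Collect Reg and every register nested beneath it.
std::set<std::string> subRegsInclusive(const std::string &Reg) {
  std::set<std::string> Out;
  Out.insert(Reg);
  bool Changed = true;
  while (Changed) { // repeat until no new sub-register is discovered
    Changed = false;
    for (const auto &KV : ParentOf)
      if (Out.count(KV.second) && Out.insert(KV.first).second)
        Changed = true;
  }
  return Out;
}
```

In this toy model, the upward walk from CL never reaches CH, while the set of registers nested under RCX does contain it. Any per-register state updated only along the upward walk can therefore diverge for CH.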
In the case of pr27681, all %RCX subregs other than %CH get conservatively added to the KeepRegs set when we see
a %CL def tied to a use in PreScanInstruction. When we later see an %ECX def in ScanInstruction's def processing,
we skip state updates because of the %ECX presence in KeepRegs. This leaves %CH in an incorrect "free" state.
My fix is to ensure %CH gets added to KeepRegs as well, and to avoid choosing a KeepRegs register
when looking for a free register.
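The two parts of the fix can be sketched on a toy model as follows. This is not the actual patch: the register table, keepWithSiblings and pickFreeReg are hypothetical names, KeepRegs is modeled as a std::set rather than the real bit vector, and "everything under the outermost super-register" is used as a simplified stand-in for the registers that must be kept.

```cpp
#include <cassert>
#include <map>
#include <set>
#include <string>
#include <vector>

// Toy register table (child -> immediate super-register); illustrative only.
static const std::map<std::string, std::string> ParentOf = {
    {"CL", "CX"}, {"CH", "CX"}, {"CX", "ECX"}, {"ECX", "RCX"}};

// Climb to the outermost register that contains Reg.
std::string topSuperReg(std::string Reg) {
  assert(!Reg.empty() && "register name must be non-empty");
  std::map<std::string, std::string>::const_iterator It;
  while ((It = ParentOf.find(Reg)) != ParentOf.end())
    Reg = It->second;
  return Reg;
}

// Part 1: keep Reg together with every register under its outermost
// super-register, so a sibling such as CH is not left out.
void keepWithSiblings(const std::string &Reg, std::set<std::string> &KeepRegs) {
  std::set<std::string> Family;
  Family.insert(topSuperReg(Reg));
  bool Changed = true;
  while (Changed) { // repeat until no new sub-register is discovered
    Changed = false;
    for (const auto &KV : ParentOf)
      if (Family.count(KV.second) && Family.insert(KV.first).second)
        Changed = true;
  }
  KeepRegs.insert(Family.begin(), Family.end());
}

// Part 2: never hand out a register that is in KeepRegs.
// Returns an empty string when no candidate qualifies.
std::string pickFreeReg(const std::vector<std::string> &Candidates,
                        const std::set<std::string> &KeepRegs) {
  for (const auto &R : Candidates)
    if (!KeepRegs.count(R))
      return R;
  return std::string();
}
```

With CL kept through keepWithSiblings, CH lands in KeepRegs too, and pickFreeReg skips it even if some other state still considers it free. The second check is deliberately redundant with the first: it guards against any remaining path that leaves a kept register looking available.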
Any use of MCSuperRegIterator or MCRegAliasIterator is prone to this subtle issue. I briefly looked at the other
uses in this optimization, and it was not clear if they could be problematic, so I chose not to touch them.
I am surprised that KeepRegs.reset() needs to be guarded with an if here. Still, it preserves existing behavior, so if the guard is somehow necessary, keep it.