This is mainly to show the issue on a reduced synthetic lit test and discuss the possible solution. I'm not convinced that the one I've included in the initial version of this patch is correct one (and even if it is, it can likely be refactored better).
The issue I'm trying to fix is similar to the one fixed in https://reviews.llvm.org/D4351
Without the patch, compiler breaks the antidependece on $esi with $ecx and generates clearly incorrect code for XOR:
$esi = XOR32rr undef $ecx, undef $ecx, implicit-def dead $eflags, implicit-def $rsi
I've attempted to follow the code's logic but wasn't able to figure out the invariant it tries to uphold.
The problems I've found (and as such, possible places to attempt the fix):
- Fix for PR20020 included the following comment:
If this reg is tied and live (Classes[Reg] is set to -1), we can't change
Yet, according to the doc above Classes:
/// For live regs that are only used in one register class in a /// live range, the register class. If the register is not live, the /// corresponding value is null. If the register is live but used in /// multiple register classes, the corresponding value is -1 casted to a /// pointer.
So should it just check against null? Is it enough?
- On x64 32-bit instructions are often (although inconsistently) annotated with implicit-def of the 64-bit reg alias.
It doesn't look like this pass handles implicit defs at all (sans, possibly, near MI.getDesc().getNumOperands()).
I don't think it's safe to assume that one can freely change implicit-def of the instruction, so perhaps adding all such encountered regs to KeepRegs could be a solution? But wouldn't that be too pessimistic given the mentioned behavior of x64 backend?
In the provided test the implicit-def $rsi in
$esi = XOR32rr undef $esi, undef $esi, implicit-def dead $eflags, implicit-def $rsi
causes the following issue:
in ScanInstruction it erases all ecx entries from RegRefs, because implicit-def is also isDef.
Then, only the uses are inserted into RegRefs in the following loop (due to isUse check).
This seems to violate isNewRegClobberedByRefs assumption stated in the comments:
//... We guard against the case in which // the two-address instruction also defines NewReg, as may happen with // pre/postincrement loads. In this case, both the use and def operands are in // RegRefs because the def is inserted by PrescanInstruction and not erased // during ScanInstruction.
Which leads to the "partial" $ecx replacement in the mir output.
Would it make sense to compute a bool HasTiedOperand variable in the first loop and skip the 2nd one if we didn't see any tied operands?