The problem
tl;dr- pass LiveVariables forgets to restore the MO::IsDead bit for reserved, physical registers.
The LiveVariables pass analyzes MachineInstrs to set the IsKill and IsDead bit on each MachineOperand. It first clears the IsKill and IsDead bits on every machine operand, then recomputes liveness, and finally sets the IsKill/IsDead bits appropriately.
Although the LiveVariables pass clears these bits for *all* MachineOperands, LiveVariables only computes liveness on virtual registers and non-reserved physical registers; it does *not* compute liveness on reserved physical registers, such as condition-code "flag" registers or the stack pointer. Consequently, *no* reserved, physical registers are ever marked as dead or killed. This is conservative, but it pessimizes subsequent passes.
Consequences of that problem
tl;dr- scheduling suffers.
In particular, LiveVariables preceeds the machine scheduler. The machine scheduler builds a dependence graph, including edges representing data, anti, and output dependencies among instructions which use physical registers (see ScheduleDAGInstrs::addPhysRegDeps). In order to avoid an overly-constrained scheduling graph, that routine attempts to skip output dependencies if the physical register is dead.
However, since LiveVariables has cleared the MachineOperand::IsDead bit for reserved, physical registers, the routine ScheduleDAGInstrs::addPhysRegDeps adds these output dependences even though they are not necessary.
In concrete terms, if there is a sequence of instructions where each defines a flag register (e.g., some ADD instructions set the OVERFLOW flag), then MISched considers them locked in order and is unwilling to schedule them in any other order. This sometimes forces the scheduler to emit very-far-from-optimal schedules.
Solution
Modify the LiveVariables pass so that it does not clear the MachineOperand::IsDead or MachineOperand::IsKill bits when the MachineOperand represents a def/use of a physical, reserved register. This criterion directly corresponds to the condition later in the same routine that guard HandlePhysRegUse and handlePhysRegDef.
I tested this patch on today's latest (git 72bf3d33aa1b7e50bca13eecf576011fcc327d51) and it passed regressions:
[100%] Running the LLVM regression tests Testing Time: 388.74s Expected Passes : 14991 Expected Failures : 132 Unsupported Tests : 111 [100%] Built target check-llvm Scanning dependencies of target check [100%] Built target check
In my out-of-tree target, I see significant improvement to the scheduling DAG (tall and narrow becomes short and broad).
Other references
- I posted a message about this to llvm-dev a few days ago and haven't heard back: http://lists.llvm.org/pipermail/llvm-dev/2015-November/092526.html