Meta: This is a series of patches that improve variable location coverage on aarch64 when using the instruction referencing model, by tracking stack spill slots and recovering instruction numbers when optimisations happen. My aim here is to make the new model more useful / accessible for the other frequently-used targets in LLVM so that it's easier to adopt.
In my experiments so far, a stage2reldebinfo build of clang has 5 percentage points more PC variable location coverage with instruction referencing (57% -> 62%), some of which will be additional entry values. Either way it's a good improvement. The variable locations that are dropped by instruction referencing but not by DBG_VALUEs are largely due to the instruction scheduler passes shuffling DBG_PHI instructions so that they refer to the wrong value definition. This should be easily solvable, but I haven't got around to it yet.
This patch:
Late in SelectionDAG we join up instruction numbers with their defining instructions, if that couldn't be done during the main part of SelectionDAG. One exception is function arguments: they don't have a defining instruction, so we have to point a DBG_PHI instruction at the incoming live register. This patch adds another exception, for constant physregs, such as $wzr on aarch64.
Technically, constant physreg values do have a defining instruction, and we could represent them like this:
    %0 = COPY $wzr, debug-instr-number 1
    DBG_INSTR_REF 1, 0
However, the copy is susceptible to being optimised out or folded through trivial def rematerialisation. Instead, because we have determined the final value of the variable (i.e., the constant physreg), we can use a DBG_PHI:
    DBG_PHI $wzr, 1
    DBG_INSTR_REF 1, 0
After this, the location can never be optimised away.
It may seem wasteful to use two instructions where we could use a single DBG_VALUE. However, the whole point of instruction referencing is to decouple the identification of values from the specification of where variable location ranges start.
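To illustrate the decision described above, here is a minimal, self-contained sketch in C++. It is a toy model, not the patch itself: RegKind, ValueSource and labelValue are hypothetical names invented for this illustration, and the output strings use the same abbreviated MIR form as the examples in this description rather than full MIR syntax.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical, simplified classification of where a value comes from.
// None of these types exist in LLVM; they only model the three cases
// discussed in this description.
enum class RegKind {
  VirtualWithDef,  // vreg with a defining instruction we can label
  LiveInArgument,  // function argument arriving in a live-in register
  ConstantPhysReg  // constant physical register, e.g. $wzr
};

struct ValueSource {
  RegKind Kind;
  std::string Reg;      // e.g. "%0", "$w0", "$wzr"
  std::string DefInstr; // textual defining instruction (VirtualWithDef only)
};

// Returns abbreviated MIR-like lines that identify the value as
// instruction number InstrNum, followed by a reference to it.
std::vector<std::string> labelValue(const ValueSource &Src, unsigned InstrNum) {
  std::vector<std::string> Out;
  const std::string Num = std::to_string(InstrNum);
  switch (Src.Kind) {
  case RegKind::VirtualWithDef:
    // Usual case: attach the instruction number to the defining instruction.
    assert(!Src.DefInstr.empty() && "a labelled vreg needs a defining instruction");
    Out.push_back(Src.DefInstr + ", debug-instr-number " + Num);
    break;
  case RegKind::LiveInArgument:
  case RegKind::ConstantPhysReg:
    // Exceptions: name the register directly with a DBG_PHI, so there is
    // no COPY that could later be optimised out or rematerialised.
    Out.push_back("DBG_PHI " + Src.Reg + ", " + Num);
    break;
  }
  Out.push_back("DBG_INSTR_REF " + Num + ", 0");
  return Out;
}
```

In this sketch, calling labelValue with a ConstantPhysReg source for $wzr and instruction number 1 yields the DBG_PHI / DBG_INSTR_REF pair shown earlier, while a VirtualWithDef source yields the labelled COPY form.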