This is an archive of the discontinued LLVM Phabricator instance.

[DebugInfo][InstrRef][3/4] Produce DBG_INSTR_REFs for all* variable locations
ClosedPublic

Authored by jmorse on Oct 6 2020, 6:36 AM.

Details

Summary

This patch emits DBG_INSTR_REFs for the final two flavours of variable locations that couldn't be handled before: copies, and inter-block VRegs. There are still some variable locations that are represented by DBG_VALUE instructions, which are:

  • Constants,
  • Frame-index references,
  • Physical registers.

The first two are independent of any optimisations that occur, and don't really need to change. DBG_VALUEs of physregs are still emitted for function parameters, and I haven't been brave enough to go anywhere near that situation. Again, such DBG_VALUEs aren't actually affected by optimisations due to their placement and their physreg operands never being optimised.

For variable locations that refer to values defined in different blocks, it's slightly tricky: such values are given cross-block virtual registers before instruction selection begins, but there isn't necessarily an instruction defining them available: I don't believe there's a guarantee on the order in which blocks are instruction-selected and emitted. To get around this: if we can't immediately identify the defining instruction of a VReg, emit a "half done" DBG_INSTR_REF, where the first operand identifies the VReg it's associated with. Then at a later point in ISel, after all blocks are emitted to MIR, find the defining instruction and mutate the DBG_INSTR_REF to refer to it. This is a hack; but either an ordering guarantee has to be put on block emission (which I imagine could involve every CodeGen test needing updating) or some kind of intermediate form is needed until all the information is available.

Copies: this is one of the things in the original RFC that I complained about: having the same value describable in multiple locations is unpleasent. I would like this instruction referencing work to put us in a position where we never refer to copy instructions at all. They wouldn't work well with the InstrRefBased LiveDebugValues implementation anyway, as copies don't define any values, they only move them.

The solution to this is the salvageCopySSA method added in MachineFunction. This interprets:

  • COPYs
  • SUBREG_TO_REG
  • Anything identifiable by the targets isCopyInstr method

And follows chains of copies back to the defining instruction that they read from. This relies on an assumption: that any copies from physical registers originate from a preceeding instruction in the same block, or is a parameter. Take a look at the tests added in instr-ref-selectiondag.ll: due to the way SelectionDAG operates, whenever an instruction defs a physical register we copy it to a VReg; salvageCopySSA walks backwards from the COPY-from-rax to find the defining instruction. The exception is arguments: those never have a defining instruction, so we have to drop a DBG_PHI at the entry block. (Right now we drop one DBG_PHI for every variable that reads an Argument; I'll revise this with a cache in a future revision of this patch).

We might end up having to have a similar "Salvage copy" operation for after regalloc, but we'll cross that bridge later.

Diff Detail

Event Timeline

jmorse created this revision.Oct 6 2020, 6:36 AM
Herald added a project: Restricted Project. · View Herald TranscriptOct 6 2020, 6:36 AM
jmorse updated this revision to Diff 300943.Oct 27 2020, 4:30 AM

(Rebase due to me fiddling with the signature of substituteDebugValuesForInst on master).

Orlando added a subscriber: Orlando.Jul 1 2021, 5:55 AM

I've taken a stab at reviewing this. I especially appreciate and enjoy the detailed comments and patch description. I've left some nits and questions inline but otherwise it looks good IMO.

llvm/include/llvm/CodeGen/MachineFunction.h
510

Does this need to return an Optional? As far as I can tell salvageCopySSA never returns None, and the only callsite doesn't check whether the returned result has a value or not.

llvm/lib/CodeGen/MachineFunction.cpp
1039–1040

I'm struggling to follow this part. When you say "only remember truncations or offsets" am I right in thinking that setting SubReg = 0 causes some kind of mismatch that is picked up and handled by LiveDebugValues/InstrRefBasedImpl in D88894?

1133–1140

Is it worth putting the create-a-dbg-phi code into a function that is shared between these patches? No strong opinion on this.

1139

nit: I think you can do Builder.addReg(State.first, RegState::Debug) on line 1135 instead.

llvm/lib/CodeGen/SelectionDAG/InstrEmitter.cpp
776

SGTM. I make a comment later about moving the SelectionDAGISel half-done-dbg-instr-fixup code into its own function. If you do that you could change and patch it up later in SelectionDAG to and patch it up later in <function-name>, which might make it easier to quickly find the relevant code in the future.

llvm/lib/CodeGen/SelectionDAG/InstrEmitter.h
121–128

It might be useful to split the factoring-out of this DBG_VALUE building code from EmitDbgValue into a separate NFC patch IMO, but it's a probably not too important.

llvm/lib/CodeGen/SelectionDAG/SelectionDAGISel.cpp
646–700

I think it would be good to move this fixup-half-done-instr-refs code into its own function. It would reduce clutter in runOnMachineFunction and you could then also add a comment to the EmitHalfDoneInstrRef lambda mentioning it so that it's easy to find from there. wdyt?

llvm/test/DebugInfo/X86/instr-ref-selectiondag.ll
83

Sorry if this is mentioned elsewhere - why are you explicitly not testing the location (position?) of the isntr-refs here?

jmorse added inline comments.Jul 1 2021, 9:35 AM
llvm/include/llvm/CodeGen/MachineFunction.h
510

/me scratches head. I guess I prototyped this with it returning an optional, then revised it to assert if any value wasn't available.

I'll update this to not return an optional.

llvm/lib/CodeGen/MachineFunction.cpp
1039–1040

Probably because it's a cop-out -- if I compile this:

void ext(int b);
int foo(char a)
{
    ext(a);
    return 0;
}

I get DBG_VALUE $edi for b rather than DBG_VALUE $al, which has encouraged me to be fairly lax about accurate widening for parameters. Off the top of my head, I can't think of any reason to omit the subregister information, I'll run some experiments to see what effect it has.

jmorse updated this revision to Diff 356540.Jul 5 2021, 11:43 AM
jmorse added a subscriber: StephenTozer.

Reshuffle this patch -- the "fixup half done instruction references" code moves to MachineFunction::finalizeDebugInstrRefs, and I've removed the ignore-widening-subreg-substitutions code. It didn't have any significant effect on variable locations, I think it was just dropping information.

Paging @StephenTozer , I've reshuffled debug instruction emission a little after the variadic-locations work landed -- I've moved single operand emission to InstrEmitter::EmitDbgValueFromSingleOp, and no-location emission to InstrEmitter::EmitDbgNoLocation. Now, empty variadic variable locations cannonicalise to DBG_VALUE $noreg which casuses a few test changes. arg-dbg-value-list.ll caught my eye -- can variadic locations not recover unused argument values right now? (I imagine this is tied up in the nest of special casing for arguments in SelectionDAG).

jmorse updated this revision to Diff 356543.Jul 5 2021, 11:53 AM

(Re upload with the SubReg = 0; bit actually taken out, earlier diff was a case of mistaken identity)

jmorse updated this revision to Diff 356689.Jul 6 2021, 5:31 AM

...and again with context this time...

Orlando accepted this revision.Jul 6 2021, 6:33 AM

Reshuffle this patch -- the "fixup half done instruction references" code moves to MachineFunction::finalizeDebugInstrRefs, and I've removed the ignore-widening-subreg-substitutions code. It didn't have any significant effect on variable locations, I think it was just dropping information.

SGTM

Paging @StephenTozer , I've reshuffled debug instruction emission a little after the variadic-locations work landed -- I've moved single operand emission to InstrEmitter::EmitDbgValueFromSingleOp, and no-location emission to InstrEmitter::EmitDbgNoLocation. Now, empty variadic variable locations cannonicalise to DBG_VALUE $noreg which casuses a few test changes. arg-dbg-value-list.ll caught my eye -- can variadic locations not recover unused argument values right now? (I imagine this is tied up in the nest of special casing for arguments in SelectionDAG).

I've left an inline comment about the empty variable location canonicalization (why it was the way it was before, plus a nit).

LGTM - though there are some clang-format nits plus my inline nit that need addressing before this lands.

llvm/lib/CodeGen/SelectionDAG/InstrEmitter.cpp
685–686

You can remove this bit of code now that you've hoisted the if (SD->isInvalidated()) outside of the if (SD->isVariadic()).

FWIW invalidated DBG_VALUE_LIST were emitted in this way ($noreg all the operands rather than emit a DBG_VALUE $noreg) with the aim of helping track down debug-info issues - the thinking being that the number of operands might hint at the identity of the DBG_VALUE when looking at print-after-all traces. I don't know whether that's a compelling enough reason to keep that way so I thought I'd just mention it in case you didn't' know, and let you decide.

This revision is now accepted and ready to land.Jul 6 2021, 6:33 AM
This revision was landed with ongoing or failed builds.Jul 6 2021, 10:41 AM
This revision was automatically updated to reflect the committed changes.