This is an archive of the discontinued LLVM Phabricator instance.

[DebugInfo][NFC?] Add new MachineOperand type and change DBG_INSTR_REF syntax
ClosedPublic

Authored by StephenTozer on Jul 8 2022, 7:19 AM.

Details

Summary

This patch implements the proposals in the RFC made earlier this year.

This patch makes two notable changes to the MIR debug info representation, which result in different MIR output but identical final DWARF output (NFC w.r.t. the full compilation). The two changes are:

  1. The introduction of a new MachineOperand type, MO_DbgInstrRef, which consists of two unsigned numbers that are used to index an instruction and an output operand within that instruction, having a meaning identical to first two operands of the current DBG_INSTR_REF instruction. This operand is only used in DBG_INSTR_REF (see below).
  2. A change in syntax for the DBG_INSTR_REF instruction, shuffling the operands to make it resemble DBG_VALUE_LIST instead of DBG_VALUE, and replacing the first two operands with a single MO_DbgInstrRef-type operand.

Combined, these changes do not affect anything about the debug info that we produce or how we handle it throughout CodeGen, but instead simply changes the way we represent instruction references:

Old: DBG_INSTR_REF 1, 0, !123, !456
New: DBG_INSTR_REF !123, !456, dbg-instr-ref(1, 0)

The motivation for this, as described in the RFC, is that we wish to be able to use instruction references in variadic debug values. One of the key difficulties with variadic debug values is that they must track the lifetimes of every register that they refer to, making them brittle to things like hoisting or sinking instructions - even if there is a valid range where a variadic debug value can be computed, the debug value may be terminated by shuffling instructions. As instruction references are more resilient against this sort of thing by referring to values produced by instructions instead of registers, freeing us from the need to move the debug values themselves around and preemptively terminate their ranges, it may be beneficial to allow variadic debug values to take advantage of this behaviour.

This patch does not grant this functionality, but takes care of all the foundational work to implement this feature, allowing the next patch to focus entirely on functional changes.

Diff Detail

Event Timeline

StephenTozer created this revision.Jul 8 2022, 7:19 AM
Herald added a project: Restricted Project. · View Herald TranscriptJul 8 2022, 7:19 AM
StephenTozer requested review of this revision.Jul 8 2022, 7:19 AM
Herald added a project: Restricted Project. · View Herald TranscriptJul 8 2022, 7:19 AM

Update diff to give full context, fix up a couple of slightly-inconsistently-formatted test expectations.

Update patch with some fixes, remove tests from the patch (to bundle all the test changes into one patch later on).

jmorse requested changes to this revision.Sep 16 2022, 9:40 AM

This all seems fine and well, I've added some nits about spellings and how things present to the developer,

This is missing test changes to account for the fact that all LL -> MIR variable locations will now be spelt differently, that should lead to a bunch of tests failing. Plus: a round-trip test for the syntax of the new operand would be good.

Seeing how this patch moves everything over to use the new operand form, how about adding a MachineVerifier check that DBG_INSTR_REF's have the correct format? That's going to catch any MIR tests where the old syntax is used, which would break in strange ways when fed into LiveDebugValues.

llvm/include/llvm/CodeGen/MachineInstr.h
1266

Un-necessary move of the label -- quite a nit, but future code archeologists looking at git blame will appreciate it not having been moved.

llvm/include/llvm/CodeGen/MachineOperand.h
67

Any particular reason for this not being at the end? I don't think it makes a difference, but on the principle of least surprise (and in case someone out there is preserving these things) it'd be better to append to this list.

599–602

I fear that this method name is going to be misinterpreted by the reader as returning what the operands index is in the owning instruction. Obviously this is a bother to think about, but worth considering. Possibly prefixing Dbg before 'Instr' and 'Op' will be clearer?

Also, for consistency, IMO it should be "Index" instead of Idx, the following method says "Index".

Similarly with the setters.

llvm/lib/CodeGen/InlineSpiller.cpp
1066 ↗(On Diff #460351)

This seems out of place in a "adding-the-machine-operand-type" patch, would it be better re-homed into one of the later ones? If there's no specific error mode in mind where this could fire, it's probably ok to fold in as just a "general improvement"

llvm/lib/CodeGen/LiveDebugValues/InstrRefBasedImpl.cpp
1324

Unrelated whitespace changes undesirable

llvm/lib/CodeGen/MIRParser/MILexer.cpp
282

clang-format fixing best done in a separate / independent patch. I believe so long as it's an obvious 80-column fix or clang-format problem, you can just push it up without review.

llvm/lib/CodeGen/MIRParser/MIParser.cpp
2721–2722

If there's an obvious way to assert for that, then yes please.

llvm/lib/CodeGen/MachineFunction.cpp
1192–1194

AFAIUI this isn't necessary at this stage (and could lead to test breakage), is that right? If so, better to fold into a "making DBG_VALUE_LIST everywhere" (or whatever it gets called in the end) patch.

llvm/lib/CodeGen/MachineSink.cpp
571–574

Isn't it necessary for the variable information to go into SeenDbgVars, so that a DBG_VALUE can't re-order with a DBG_INSTR_REF

llvm/lib/CodeGen/SelectionDAG/FastISel.cpp
1255

IMO easier to read by keeping the original spelling (UseInstrRefDebugInfo && Op->isReg()) and just swapping the order of blocks -- inverting the logic causes the reader to think something complicated is going on.

This revision now requires changes to proceed.Sep 16 2022, 9:40 AM

Ah, OK, you threw me by having the test updates in the final patch -- IMO better to have them in each patch themselves, as you're going to have to land them like that. Could you stage those test changes into this patch please?

Ah, OK, you threw me by having the test updates in the final patch -- IMO better to have them in each patch themselves, as you're going to have to land them like that. Could you stage those test changes into this patch please?

Right, it is somewhat confusing - the reason for this was mainly that this and another patch in this stack both change the way DBG_INSTR_REFs will be printed, and so it would probably create a lot of noise having both patches update every test that uses DBG_INSTR_REF. If you feel it would be better represented with all the changes in each patch exactly as they'll land, then I can do so.

Return tests to this patch, and address the remaining review comments.

llvm/include/llvm/CodeGen/MachineOperand.h
599–602

That's a reasonable point - I'd probably object to using Dbg as the prefix though, as DbgOp is used elsewhere to refer to the operands in debug value (and as of this patch, in a DBG_INSTR_REF!). It's a long function name, but I think using InstrRef as the prefix is pretty unambiguous.

llvm/lib/CodeGen/MachineFunction.cpp
1192–1194

The reason for making this change is just that it keeps the undef-ing simple, as converting it to a DBG_VALUE would now require us to move and add operands, and after the next patch in the stack also modify the DIExpression. Since it's an undef value either way it doesn't make any difference in the final output, and AFAIK doesn't cause much/any breakage with existing tests. But if you think the tradeoff is worth it then I'm open to doing the DBG_VALUE conversion, WDYT?

jmorse accepted this revision.Oct 6 2022, 3:49 AM

LGTM; the test changes from this patch are reassuringly mechanical!

llvm/include/llvm/CodeGen/MachineOperand.h
599–602

SGTM

llvm/lib/CodeGen/MachineFunction.cpp
1192–1194

SGTM

This revision is now accepted and ready to land.Oct 6 2022, 3:49 AM
jmorse added inline comments.Oct 6 2022, 3:50 AM
llvm/lib/CodeGen/MachineFunction.cpp
1192–1194

(i.e.: that sounds like a fine reason to make this change, no need for further work)

Rebased onto more recent state.

Rebased again now that the patch stack is ready to land.

This revision was landed with ongoing or failed builds.Jan 6 2023, 10:04 AM
This revision was automatically updated to reflect the committed changes.