This is an archive of the discontinued LLVM Phabricator instance.

[DebugInfo][InstrRef] Track instructions that write-to-stack after having a spill fused into them
ClosedPublic

Authored by jmorse on Oct 7 2021, 8:33 AM.

Details

Summary

During register allocation, some instructions can have stack spills fused into them. It means that when vregs are allocated on the stack we can convert:

SETCCr %0
DBG_VALUE $0

to

SETCCm %stack.0
DBG_VALUE %stack.0

for example. DBG_VALUE instructions can simply have their vreg operands replaced with a stack location, and it all Just Works (TM). Unfortunately instruction referencing finds this harder: a store to the stack doesn't have a specific operand number, therefore we don't substitute the old operand for a new operand, and the location is dropped.

This patch implements a solution: just recognise the memory operand attached to an instruction with a Special Number (TM), and record a substitution between the old value and the new one. Thus we would have:

SETCCr %0 [...] debug-instr-number 1 :: (store (s32) into %stack.0)
DBG_INSTR_REF 1, 0

become

SETCCm %stack.0 [...] debug-instr-number 2 :: (store (s32) into %stack.0)
DBG_INSTR_REF 1, 0

With a substitution from operand (1, 0) to (2, MemoryOperand). In LiveDebugValues, we can identify the stack slot written to just like a spill instruction, and track its value from that point onwards.

This patch installs the substitution code into InlineSpiller: it's not that interesting, just adds a substitution. It's not completely clear to me what all the circumstances that the spiller runs with are, so I've filtered to only substitute in common cases, see the two MIR tests added.

In InstrRefBasedLDV, we now:

  • Recognise instructions that write to stack slots as def'ing that stack slot,
  • Allow DBG_INSTR_REFs that refer to a memory operand to "read" the value defined by that instruction, in the stack slot.

Diff Detail

Event Timeline

jmorse created this revision.Oct 7 2021, 8:33 AM
jmorse requested review of this revision.Oct 7 2021, 8:33 AM
Herald added a project: Restricted Project. · View Herald TranscriptOct 7 2021, 8:33 AM
jmorse added inline comments.Oct 11 2021, 10:18 AM
llvm/lib/CodeGen/LiveDebugValues/InstrRefBasedImpl.h
998

Should be !isAliased lost the inversion in copy+paste, happily the test catches this.

(from the commit message)

`becomes

SETCCm %stack.0 [...] debug-instr-number 2 :: (store (s32) into %stack.0)
DBG_INSTR_REF 1, 0

`
this actually becomes DBG_INSTR_REF 2, 0, right?

this actually becomes DBG_INSTR_REF 2, 0, right?

Actually no -- when optimisations are occurring with instruction referencing, I'm recording them in the "debugValueSubstitutions" table, which maps "old" instruction numbers onto new. Take a look at the MIR in memory-operand-tracking.mir which is added by this patch; there's a substitution from "2, 0" to "3, 1000000" (the large number signifying the memory operand).

Because instruction referencing gets rid of virtual register references, there's no efficient way to find the debug instructions that use a particular instruction number; perhaps we could make that happen, but I don't think we would gain anything from it. That's why I'm storing records of substitutions in a side-table, they're used to "patch up" the DBG_INSTR_REFs at the end of compilation.

(One downside of this approach is it adds extra steps for humans interpreting the debug-info, because they might have to interpret the substitutions; we could "patch up" DBG_INSTR_REFs as part of the printing process if that becomes a burden).

Aside from the current comments, LGTM.

llvm/lib/CodeGen/InlineSpiller.cpp
935
llvm/lib/CodeGen/LiveDebugValues/InstrRefBasedImpl.cpp
1395

Just out of curiosity, does this represent a bug prior to this patch, or is this just needed for the new functionality?

llvm/lib/CodeGen/MachineFunction.cpp
977
llvm/test/DebugInfo/MIR/InstrRef/memory-operand-folding-tieddef.mir
38

Nit, the unnecessary metadata (some of the long directories and version strings) can be removed.

jmorse added inline comments.Oct 12 2021, 4:15 AM
llvm/lib/CodeGen/LiveDebugValues/InstrRefBasedImpl.cpp
1395

I don't thiiiiink so: I've added this as an act of conservativeness, in case the target hooks recognise any stack write as being a stack spill. I suspect this doesn't happen for x86; perhaps it happens for other architectures though.

this actually becomes DBG_INSTR_REF 2, 0, right?

Actually no -- when optimisations are occurring with instruction referencing, I'm recording them in the "debugValueSubstitutions" table, which maps "old" instruction numbers onto new. Take a look at the MIR in memory-operand-tracking.mir which is added by this patch; there's a substitution from "2, 0" to "3, 1000000" (the large number signifying the memory operand).

Because instruction referencing gets rid of virtual register references, there's no efficient way to find the debug instructions that use a particular instruction number; perhaps we could make that happen, but I don't think we would gain anything from it. That's why I'm storing records of substitutions in a side-table, they're used to "patch up" the DBG_INSTR_REFs at the end of compilation.

(One downside of this approach is it adds extra steps for humans interpreting the debug-info, because they might have to interpret the substitutions; we could "patch up" DBG_INSTR_REFs as part of the printing process if that becomes a burden).

I see... Thanks for the explanation. Improving the MIR output could be a future-work, but I don't want to block anything here.

jmorse updated this revision to Diff 381731.Oct 23 2021, 6:03 AM

Update for Stephens nits -- but also to rebase on top of the the track-subregisters-on-the-stack patch, D112133. That exposes an interesting circumstance, which probably wants a little more review.

Now that there are more "indexes" within a spill slot that we can refer to, if we have:

INC32m $rsp, 1, $noreg, 4, [...]debug-instr-number 1 :: (store (s32) into %stack.0)
DBG_INSTR_REF 1, 1000000

Where the DBG_INSTR_REF refers to the memory operand of the INC32m: which stack index should this refer to? It could be 8/16/32/64 bits large. This is not a pointless question, because if the value is loaded off the stack, we need to try and follow it in the correct register.

Happily we can look at the memory operand on the INC32m and determine it's size; that then tells us which index on the stack the DBG_INSTR_REF should refer to, which is what I've added in this revision, in findLocationForMemOperand.

jmorse updated this revision to Diff 381803.Oct 24 2021, 2:40 PM

... and while I'm here, we should substitute instruction numbers when loads are folded into an instruction. See the added test memory-operand-load-folding.mir where the CVTTSS2SIrr (convert single precision number to signed int?) has a load folded into it. The instruction number should be preserved (with a substitution) for the sake of the value the instruction defines.

LGTM, though I'm not sure I feel qualified to press the green button myself in this instance. I left a few inline nits, plus please make the clang-format bot happy.

llvm/include/llvm/CodeGen/MachineFunction.h
549

Can you use const/constexpr here so that you can include the definition here in the header? Or is there some reason that this isn't const?

llvm/lib/CodeGen/LiveDebugValues/InstrRefBasedImpl.cpp
1452–1457

This is repeated a few lines above, between 1442-1445 (patch juggling?)

llvm/test/DebugInfo/MIR/InstrRef/memory-operand-tracking.mir
40

Is there a way to make the IR printer/reader to print/read MemOperand or something instead of 1000000 (if yes and you agree - this could be a separate patch)?

55–58

What is the purpose of this part of the test?

jmorse added inline comments.Oct 25 2021, 4:19 AM
llvm/test/DebugInfo/MIR/InstrRef/memory-operand-tracking.mir
55–58

It's testing that the old value, DBG_PHI $rax, 1, can't be found anywhere. It's moderately important as the INC32m's write to the stack should terminate any earlier values being tracked.

Thinking about it, this test should also check that variables _currently_ located on the stack get terminated if it's overwritten, i.e.:

SOMEINST debug-instr-number 1 ::(store 8 into %stack.0)
DBG_INSTR_REF 1, 1000000
OTHERINST ::(store 8 into %stack.0)

needs an explicit DBG_VALUE $noreg after OTHERINST overwrites the stack locataion. DbgEntityHistoryCalculator isn't aware of memory stores.

55–58

(To uh, be clear, I'll fold better explanations into this).

This revision was not accepted when it landed; it landed in state Needs Review.Oct 25 2021, 7:15 AM
This revision was landed with ongoing or failed builds.
This revision was automatically updated to reflect the committed changes.
jmorse added inline comments.Oct 25 2021, 7:21 AM
llvm/lib/CodeGen/LiveDebugValues/InstrRefBasedImpl.cpp
1452–1457

Rats, missed this, and you're right it's patch juggling. I'll whip it out in an NFC commit later.

llvm/test/DebugInfo/MIR/InstrRef/memory-operand-tracking.mir
40

I think for it to be well-formed JSON, we'd need some kind of flag indicating whether it's a "normal" operand or a memory reference, then make dstop optional. While that's a pain, it's probably easier to read; will need to be a later patch though.

thakis added a subscriber: thakis.Oct 25 2021, 8:28 AM

Looks like this breaks tests on arm macs: http://45.33.8.238/macm1/20568/step_11.txt (...or possibly on machines that default to non-intel triples in general?)

Please take a look!

Looks like this breaks tests on arm macs: http://45.33.8.238/macm1/20568/step_11.txt (...or possibly on machines that default to non-intel triples in general?)

Please take a look!

Argh, yep, that's another "all the world's an x86" fault for me there, fix up in a moment.

Thanks for the fix!