Page MenuHomePhabricator

[DebugInfo][DAG] Either salvage dangling debug info or emit Undef DBG_VALUEs

Authored by jmorse on Mon, Feb 4, 9:20 AM.



In this patch SelectionDAG tries to salvage any dbg.values that are going to be dropped, in case they can be recovered from Values in the current BB. It also strengthens SelectionDAGs handling of dangling debug data, so that dbg.values are *always* emitted (as Undef or otherwise) instead of dangling forever.

The motivation behind this patch exists in the new test case: a memory address (here a bitcast and GEP) exist in one basic block, and a dbg.value referring to the address is left in the 'next' block. The base pointer is live across all basic blocks. In current llvm trunk the dbg.value cannot be encoded, and it isn't even emitted as an Undef DBG_VALUE.

Note at this point that this change depends on two NFC refactors; I'll edit their phab-diff numbers in here when they're up in a moment. One refactors the dbg_value-to-SDDbgValue functionality into a utility method (handleDebugValue), the other refactors salvageDebugInfo to allow salvaging things that aren't necessarily DbgValueInsts (salvageDebugInfoImpl).

The change is simply: if we're definitely going to drop a dbg.value, repeatedly apply salvageDebugInfo to its operand until either we find something that can be encoded, or we can't salvage any further in which case we produce an Undef DBG_VALUE. To know when we're "definitely going to drop a dbg.value", SelectionDAG signals SelectionDAGBuilder when all IR instructions have been encoded to force salvaging. This ensures that any dbg.value that's dangling after DAG creation will have a corresponding DBG_VALUE encoded. (There's a FIXME about this in the dbg_value intrinsic handler).

The sdag-dangling-dbgvalue.ll test changes and gets some new Undef DBG_VALUEs: each time it's because there's a dbg.value of an unseen SDNode, immediately followed by a dbg.value for the same variable of a null constant. IMO this is expected: the dbg.values with unseen operands were previously silently dropped, emitting Undefs is better. The fact that there are two in a row for the same variable is because that's what's in the test input.

Finally some clang-3.4 debuginfo statistics: I've used D57584 as a base because this patch depends on it, and measure variables-with-location coverage and bytes-in-scope coverage. I've also measured four flavours of this patch:

  • as-is (plain)
  • no force-salvage signal from SelectionDAG
  • no new Undef DBG_VALUE generation.
  • Neither Undef nor force-salvage
        |  Pct var cov | Pct scope-bytes |
D57584  |     75.4     |       46.9      |
Plain   |     75.2     |       47.8      |
No-force|     75.4     |       47.0      |
No-undef|     75.7     |       48.3      |
No-undef|              |                 |
 and no |              |                 |
 force  |     75.4     |       46.9      |

The headline figures for "plain" is that there's an almost-1% improvement in scope-bytes coverage; but a 0.2% drop in percent variables covered. My interpretation is that forcing salvaging is good (No-undef has good numbers), and the reduction in variable coverage comes from the new Undef DBG_VALUEs are being generated where they weren't before. Which IMO is an accuracy improvement.

The reason for more Undefs reducing percent variable coverage is partially due to there being more source-variables in the output binaries as a result. Speculation: I reckon some of this reduction is because we sometimes hoist dbg.values outside of their lexical scopes (due to our old friend placeDbgValues) and now that we correctly terminate dangling dbg.values, we find that some of their ranges are no longer anywhere in their lexical scopes, leading to no location at all.

Diff Detail


Event Timeline

jmorse created this revision.Mon, Feb 4, 9:20 AM
Herald added a project: Restricted Project. · View Herald TranscriptMon, Feb 4, 9:20 AM
jmorse edited the summary of this revision. (Show Details)Mon, Feb 4, 9:21 AM

Looks mostly good!

1195 ↗(On Diff #185066)

That comment should be in the function body or in the header.

Perhaps rename the function so it contains the word salvage?

1200 ↗(On Diff #185066)

variable name & explicit type?

1221 ↗(On Diff #185066)


1239 ↗(On Diff #185066)


1240 ↗(On Diff #185066)

Shouldn't we better use the type that was in the dbg.value for consistency? It makes no semantic difference, but it makes the IR less confusing to read.

1244 ↗(On Diff #185066)

debug location is usually used to refer to DILocation. Perhaps just say "debug value" instead?

678 ↗(On Diff #185066)

found a typo in the context

688 ↗(On Diff #185066)


jmorse updated this revision to Diff 185086.Mon, Feb 4, 10:38 AM

Address comments; also re-instate and revise the 'TODO' comment I was deleting. After reading it again, I think it's not referring to the problem that this patch solves.

jmorse marked 8 inline comments as done.Mon, Feb 4, 10:39 AM
aprantl accepted this revision.Mon, Feb 4, 12:20 PM
aprantl added inline comments.
682 ↗(On Diff #185086)

Please drop the resolveDanglingDebugInfo -

5 ↗(On Diff #185086)

can you add a check for the next: label, too?

This revision is now accepted and ready to land.Mon, Feb 4, 12:20 PM
This revision was automatically updated to reflect the committed changes.