In this patch SelectionDAG tries to salvage any dbg.values that are going to be dropped, in case they can be recovered from Values in the current BB. It also strengthens SelectionDAGs handling of dangling debug data, so that dbg.values are *always* emitted (as Undef or otherwise) instead of dangling forever.
The motivation behind this patch exists in the new test case: a memory address (here a bitcast and GEP) exist in one basic block, and a dbg.value referring to the address is left in the 'next' block. The base pointer is live across all basic blocks. In current llvm trunk the dbg.value cannot be encoded, and it isn't even emitted as an Undef DBG_VALUE.
Note at this point that this change depends on two NFC refactors; I'll edit their phab-diff numbers in here when they're up in a moment. One refactors the dbg_value-to-SDDbgValue functionality into a utility method (handleDebugValue), the other refactors salvageDebugInfo to allow salvaging things that aren't necessarily DbgValueInsts (salvageDebugInfoImpl).
The change is simply: if we're definitely going to drop a dbg.value, repeatedly apply salvageDebugInfo to its operand until either we find something that can be encoded, or we can't salvage any further in which case we produce an Undef DBG_VALUE. To know when we're "definitely going to drop a dbg.value", SelectionDAG signals SelectionDAGBuilder when all IR instructions have been encoded to force salvaging. This ensures that any dbg.value that's dangling after DAG creation will have a corresponding DBG_VALUE encoded. (There's a FIXME about this in the dbg_value intrinsic handler).
The sdag-dangling-dbgvalue.ll test changes and gets some new Undef DBG_VALUEs: each time it's because there's a dbg.value of an unseen SDNode, immediately followed by a dbg.value for the same variable of a null constant. IMO this is expected: the dbg.values with unseen operands were previously silently dropped, emitting Undefs is better. The fact that there are two in a row for the same variable is because that's what's in the test input.
Finally some clang-3.4 debuginfo statistics: I've used D57584 as a base because this patch depends on it, and measure variables-with-location coverage and bytes-in-scope coverage. I've also measured four flavours of this patch:
- as-is (plain)
- no force-salvage signal from SelectionDAG
- no new Undef DBG_VALUE generation.
- Neither Undef nor force-salvage
| Pct var cov | Pct scope-bytes | +----------------------------------------- D57584 | 75.4 | 46.9 | Plain | 75.2 | 47.8 | No-force| 75.4 | 47.0 | No-undef| 75.7 | 48.3 | No-undef| | | and no | | | force | 75.4 | 46.9 |
The headline figures for "plain" is that there's an almost-1% improvement in scope-bytes coverage; but a 0.2% drop in percent variables covered. My interpretation is that forcing salvaging is good (No-undef has good numbers), and the reduction in variable coverage comes from the new Undef DBG_VALUEs are being generated where they weren't before. Which IMO is an accuracy improvement.
The reason for more Undefs reducing percent variable coverage is partially due to there being more source-variables in the output binaries as a result. Speculation: I reckon some of this reduction is because we sometimes hoist dbg.values outside of their lexical scopes (due to our old friend placeDbgValues) and now that we correctly terminate dangling dbg.values, we find that some of their ranges are no longer anywhere in their lexical scopes, leading to no location at all.
I don't think DenseMap iteration order is guaranteed, unlike std::map?