This is an archive of the discontinued LLVM Phabricator instance.

[DebugInfo][InstCombine] Prefer salvaging dbg.values over sinking them
ClosedPublic

Authored by jmorse on Jan 16 2019, 8:41 AM.

Details

Summary

When sinking an instruction InstCombine currently sinks any local dbg.value users -- something that potentially re-orders variables. I've experienced some circumstances where pointer casts (and associated dbg.values) get sunk across multiple blocks to the point where they're used, artificially shortening the location range of the corresponding debuginfo variable.

Instead of sinking everything, attempt to salvage the dbg.value first. This requires exposing a more expressive form of salvageDebugInfo where we can specify which debug users to attempt to salvage -- otherwise many debug users would be needlessly rewritten. If un-successful, the debug users gets sunk as it does now to prevent debug-use-before-def, if successful it stays. While we're here, fix the salvageDebugUsers' salvaging of GEPs to return false if it's unsalvagable.

Testing for this behaviour happens in the updated test -- where we can push through both the sunk addition, and even the load, to keep all dbg.values in their original locations.

(This patch makes minimal differences to a build of clang-3.4, once more due to our old friend placeDbgValues, but is a worthwhile improvement IMHO).

Diff Detail

Event Timeline

jmorse created this revision.Jan 16 2019, 8:41 AM

Seems reasonable to me.

include/llvm/Transforms/Utils/Local.h
348

ArrayRef ?

jmorse updated this revision to Diff 182264.Jan 17 2019, 5:53 AM
jmorse marked an inline comment as done.

Use an ArrayRef rather than a SmallVectorImpl

I think this looks reasonable, thanks!

bjope added inline comments.Jan 31 2019, 12:17 AM
test/Transforms/InstCombine/debuginfo_add.ll
52–53

Shouldn't there still be som dbg.value for variable !25 after the load, indicating that the variable is in %0?

Have you checked what happens in the final output? It all might depend on which live range that is longest (%start or %0), but if this load is the last user of %start we only know how to get the value of the variable !25 from %0 after the load.

I'm also not sure how well LLVM handles these dbg.value intrinsics with DW_OP_deref. What defines the end of the debug range for those? The end of the BB? The next instruction that potentially can write to memory? When the pointer register is dead? The use of DW_OP_deref (in opt) has been quite rare so far (afaik). A direct mapping to the SSA value is a more well established concept (the dbg.value is valid for the range of the SSA value, or up until the next dbg.value that indicates that the variable is somewhere else). Perhaps also limited to the current BB (or is that only for DBG_VALUE and not dbg.value? or only after de-SSA?). But when we say that the variable is in memory (and not a unique stack slot for the variable), how do we know when this isn't valid any longer when calculating debug value ranges?

jmorse marked an inline comment as done.Jan 31 2019, 7:35 AM
jmorse added inline comments.
test/Transforms/InstCombine/debuginfo_add.ll
52–53

Shouldn't there still be some dbg.value for variable !25 after the load, indicating that the variable is in %0?

Have you checked what happens in the final output? It all might depend on which live range that is longest (%start or %0), but if this load is the last user of %start we only know how to get the value of the variable !25 from %0 after the load.

Definitely true -- experimentally testing this on a build of clang-3.4, with the patch as it is we give 75.07% of all variables locations, and cover 45.15% of scope bytes; placing a dbg.value at the sunk location too produces 75.16% location coverage and 45.71%. Which is a small but non-trivial improvement.

The downside is that we will leave a dbg.value in each block an instruction sinks through -- and if multiple memory computations are salvaged through (load then gep then ptrcast) each one will leave a dbg.value in blocks sunk through. That being said, a few timed builds of clang-3.4 show the performance penalty as being less than 0.5%, which is well within error margins.

I'm also not sure how well LLVM handles these dbg.value intrinsics with DW_OP_deref. [Various possibilities].

A great question, that I don't know the answer to right now. I'll try and break some examples using this patch.

jmorse marked an inline comment as done.Feb 6 2019, 7:31 AM

I suspect this change now blocks on https://bugs.llvm.org/show_bug.cgi?id=40628 (see inline comments), or at least until there's more understanding of how to resolve problems with new DW_OP_derefs being introduced.

test/Transforms/InstCombine/debuginfo_add.ll
52–53

I'm also not sure how well LLVM handles these dbg.value intrinsics with DW_OP_deref.

Unfortunately it would appear there's no termination of such location ranges at all, and a variable that becomes a salvaged load can observe subsequent stores through the DW_OP_deref expression. I've filed this as https://bugs.llvm.org/show_bug.cgi?id=40628

jmorse updated this revision to Diff 186479.Feb 12 2019, 8:32 AM
jmorse marked 2 inline comments as done.

The issue with generating new DW_OP_derefs has been resolved by r353824, as a result we no longer change existing tests with this patch.

I realised at the same time that we should be emitting undef dbg.values at this stage: if we sink a dbg.value and cannot salvage it, then we should terminate any earlier location range. Trying the clang-3.4 build statistics I get the following results based on r353832 (note that the llvm-dwarfdump --statistics formula changed recently). First column variables-with-location coverage, second scope-bytes-covered.

r353832: 76.9%, 44.8%
salvage, no undef: 77.2%, 44.9%
salvage and undef: 77.2%, 45.5%

Adding the undefs loses 300 variable locations (0.008%), and yields 0.6% more scope bytes covered. Exactly where the scope-bytes-covered improvement comes from is a bit of a mystery to me, but it's an improvement nonetheless.

Note that I've added a reverse() on the iteration over DbgUsers... without this, the order of some dbg.values in Transforms/InstCombine/debuginfo_add.ll swap (!25 and !26). I.. don't really have a good explanation as to why.

(Plus I added a new test to cover this behaviour, and removed some plumbing for salvageDebugUsersForDbgValues that was folded into an earlier commit).

aprantl accepted this revision.Feb 12 2019, 10:05 AM
aprantl added inline comments.
lib/Transforms/InstCombine/InstructionCombining.cpp
3102–3106

"is local to" -> is in the same basic block ?

3115

Should we add a replaceWithUndef method to DebugInfoIntrinsicInst?

This revision is now accepted and ready to land.Feb 12 2019, 10:05 AM
bjope added a comment.Feb 12 2019, 2:34 PM

LGTM as well!

jmorse marked 4 inline comments as done.Feb 13 2019, 2:33 AM
jmorse added inline comments.
lib/Transforms/InstCombine/InstructionCombining.cpp
3102–3106

Folding into commit,

3115

We definitely should -- this should be a common operation by optimisations. (I'll file a follow-up at some point).

jmorse marked 2 inline comments as done.Feb 13 2019, 2:33 AM

Also, thanks for reviews!

This revision was automatically updated to reflect the committed changes.
Herald added a project: Restricted Project. · View Herald TranscriptFeb 13 2019, 2:54 AM
bjope added inline comments.Feb 14 2019, 1:28 AM
llvm/trunk/lib/Transforms/InstCombine/InstructionCombining.cpp
3122 ↗(On Diff #186616)

(post commit comment)
I see things like this in the IR after this:

call void @llvm.dbg.value(metadata %foo * undef, metadata !2292, metadata !DIExpression(DW_OP_plus_uconst, 3, DW_OP_stack_value)), !dbg !2760

It doesn't fill any purpose to have a complicated DIExpression when being based on an undef value afaict.
Maybe we should strip away the DIExpression right away here, or what do you think?

jmorse marked an inline comment as done.Feb 14 2019, 3:26 AM
jmorse added inline comments.
llvm/trunk/lib/Transforms/InstCombine/InstructionCombining.cpp
3122 ↗(On Diff #186616)

I agree, it's pointless memory use and a distraction (or even misleading). This should probably be rolled into a DbgVariableIntrinsic method that sets operand-0 to undef *and* clears the expression.

I imagine that for fragments of larger variables though, we would need to keep the DW_OP_LLVM_fragment so that only that portion of the variable gets undef'd.

(I can't follow this up with code for about a week due to other backlogs alas).

bjope added inline comments.Feb 14 2019, 3:47 AM
llvm/trunk/lib/Transforms/InstCombine/InstructionCombining.cpp
3122 ↗(On Diff #186616)

Yes, good catch. We still need it (or at least parts of the DIExpression) for fragments, I did not think about that.

Anyway, this won't cost anything in the final DWARF location lists, so it is not that important. It is just a little bit confusing when looking at the IR, and it might have a negligible cost in the IR.

(Maybe I'll find some time to fix it. Although not on top of the priority list right now. And I do not think it is worth writing a PR for this.)