This is an archive of the discontinued LLVM Phabricator instance.

[Debugify][Original DI] Test preservation of original debug var intrinsics in optimizations
ClosedPublic

Authored by djtodoro on Apr 20 2021, 5:12 AM.

Details

Summary

This is an improvement of [0]. This adds checking of original llvm.dbg.values()/declares() instructions in optimizations.

We have picked a real issue that has been found with this (actually, picked one variable location missing from [1] and resolved the issue), and the result is the fix for that -- D100844.

Before applying the D100844, using the options from [0] (but with this patch applied) on the compilation of GDB 7.11, the final HTML report for the debug-info issues can be found at [1] (please scroll down, and look for "Summary of Variable Location Bugs"). After applying the D100844, the numbers has improved a bit -- please take a look into [2].

[0] https://llvm.org/docs/HowToUpdateDebugInfo.html#test-original-debug-info-preservation-in-optimizations
[1] https://djolertrk.github.io/di-check-before-adce-fix/
[2] https://djolertrk.github.io/di-check-after-adce-fix/

Diff Detail

Event Timeline

djtodoro created this revision.Apr 20 2021, 5:12 AM
djtodoro requested review of this revision.Apr 20 2021, 5:12 AM
Herald added a project: Restricted Project. · View Herald TranscriptApr 20 2021, 5:12 AM

Hi @djtodoro I am looking at this (slowly, sorry!).

IIUC this checks that the number of debug intrinsics for each non-inlined variable does not decrease after each optimisation pass. There are legitimate reasons for deleting debug intrinsics. RemoveRedundantDbgInstrs, for example, which is called in a few places. With that in mind I think I would prefer the table header "Number of bugs" to change as this number seems to be more of an indicator of possible bugs, rather than proof of their existence.

You might be able to reduce (*) the number of false positive results by ignoring "redundant" debug intrinsics in the count, though maybe that could be follow-up work?

(*) Another example of a false positive, or at least a true positive that we probably want to ignore: I tried this patch out out using a different target codebase and investigated a debug intrinsic dropped in SROA/mem2reg. Amusingly, git blame pointed right back at me: D89810. D89810, and D85555 which the former is based on, deliberately remove dbg.value+DW_OP_deref intrinsics to work around a common case of location coverage reduction ultimately caused by LowerDbgDeclare.

@Orlando Thanks for looking into this.

Hi @djtodoro I am looking at this (slowly, sorry!).

IIUC this checks that the number of debug intrinsics for each non-inlined variable does not decrease after each optimisation pass. There are legitimate reasons for deleting debug intrinsics. RemoveRedundantDbgInstrs, for example, which is called in a few places. With that in mind I think I would prefer the table header "Number of bugs" to change as this number seems to be more of an indicator of possible bugs, rather than proof of their existence.

Sure. That is why we treat it as "WARNING". The same is happening with dropping of dbg location attached to instructions -- there are places where we cannot salvage !dbg, and we treat it as a warning as well (until we add some super cool logic to distinguish what was reasonable drop or opposite).

You might be able to reduce (*) the number of false positive results by ignoring "redundant" debug intrinsics in the count, though maybe that could be follow-up work?

That could be follow-up for sure. Thanks. In addition, I am seeing a lot of artifitial variables location dropping (such as __result, __s etc.), and I am not sure if we should care about these.

(*) Another example of a false positive, or at least a true positive that we probably want to ignore: I tried this patch out out using a different target codebase and investigated a debug intrinsic dropped in SROA/mem2reg. Amusingly, git blame pointed right back at me: D89810. D89810, and D85555 which the former is based on, deliberately remove dbg.value+DW_OP_deref intrinsics to work around a common case of location coverage reduction ultimately caused by LowerDbgDeclare.

@Orlando Thanks for looking into this.

Hi @djtodoro I am looking at this (slowly, sorry!).

IIUC this checks that the number of debug intrinsics for each non-inlined variable does not decrease after each optimisation pass. There are legitimate reasons for deleting debug intrinsics. RemoveRedundantDbgInstrs, for example, which is called in a few places. With that in mind I think I would prefer the table header "Number of bugs" to change as this number seems to be more of an indicator of possible bugs, rather than proof of their existence.

Sure. That is why we treat it as "WARNING". The same is happening with dropping of dbg location attached to instructions -- there are places where we cannot salvage !dbg, and we treat it as a warning as well (until we add some super cool logic to distinguish what was reasonable drop or opposite).

That makes sense, cool. Out of curiosity, did you run into many false positives before you found the ADCE bug (D100844)?

I investigated another couple of dropped locations this morning which both look like false positive results too. One was in ipsccp where a dead block which contains a dbg.value is deleted. The other was in Early CSE w/ MemorySSA, which looks like it was being flagged because it makes a dbg.value undef. I've made an inline comment about this last point - undef dbg.values are currently not counted by this patch and I'm not sure if that should be the case.

You might be able to reduce (*) the number of false positive results by ignoring "redundant" debug intrinsics in the count, though maybe that could be follow-up work?

That could be follow-up for sure. Thanks. In addition, I am seeing a lot of artifitial variables location dropping (such as __result, __s etc.), and I am not sure if we should care about these.

I think the artificial variables are still useful to us. It may not matter to users so much if the locations are incorrect, but for us the intrinsics just provide more test coverage for this tool. wdyt?

llvm/lib/Transforms/Utils/Debugify.cpp
565

Why skip undef values here? Skipping undef values in the count means that any pass that makes a debug intrinsic undef will be flagged.

While making a debug intrinsic undef can sometimes be suboptimal (i.e. where a salvage or value replacement is possible), I don't think it ever reduces the correctness of the debug info.

@Orlando Thanks for looking into this.

Hi @djtodoro I am looking at this (slowly, sorry!).

IIUC this checks that the number of debug intrinsics for each non-inlined variable does not decrease after each optimisation pass. There are legitimate reasons for deleting debug intrinsics. RemoveRedundantDbgInstrs, for example, which is called in a few places. With that in mind I think I would prefer the table header "Number of bugs" to change as this number seems to be more of an indicator of possible bugs, rather than proof of their existence.

Sure. That is why we treat it as "WARNING". The same is happening with dropping of dbg location attached to instructions -- there are places where we cannot salvage !dbg, and we treat it as a warning as well (until we add some super cool logic to distinguish what was reasonable drop or opposite).

That makes sense, cool. Out of curiosity, did you run into many false positives before you found the ADCE bug (D100844)?

I am seeing some false positives for sure (most of these artifital vars locs are being removed after some "constant propagation" passes). We definitely need to investigate how to remove these false positives here (for both, instr and var locations).

I investigated another couple of dropped locations this morning which both look like false positive results too. One was in ipsccp where a dead block which contains a dbg.value is deleted. The other was in Early CSE w/ MemorySSA, which looks like it was being flagged because it makes a dbg.value undef. I've made an inline comment about this last point - undef dbg.values are currently not counted by this patch and I'm not sure if that should be the case.

I don't know how (yet), but we can (somehow) mark some places in pipeline as "valid for dropping" or something like that. Please note that not all setting of "undef" is the only thing we can do -- for example, calling the salvageDebugInfo() sometimes prevents a pass of making some llvm.dbg.value() first operand as undef.

You might be able to reduce (*) the number of false positive results by ignoring "redundant" debug intrinsics in the count, though maybe that could be follow-up work?

That could be follow-up for sure. Thanks. In addition, I am seeing a lot of artifitial variables location dropping (such as __result, __s etc.), and I am not sure if we should care about these.

I think the artificial variables are still useful to us. It may not matter to users so much if the locations are incorrect, but for us the intrinsics just provide more test coverage for this tool. wdyt?

llvm/lib/Transforms/Utils/Debugify.cpp
565

While making a debug intrinsic undef can sometimes be suboptimal (i.e. where a salvage or value replacement is possible), I don't think it ever reduces the correctness of the debug info.

Hmmm... it does not affect the correctness, but it does affect completeness (e.g. in optimized code, variable should be covered for the places it is alive, but in many cases it is impossible). How are we going to find cases where we have missed to call salvageDebugInfo()?
In addition, we can add some checking levels -- e.g. there could be level that doesn't consider "undef" locations, which will reduce the number of false positives, for sure...

I am seeing some false positives for sure (most of these artifital vars locs are being removed after some "constant propagation" passes). We definitely need to investigate how to remove these false positives here (for both, instr and var locations).

I guess that can all fall under "follow up work" too? I think it is worth mentioning the caveat that there are known false positive cases in https://llvm.org/docs/HowToUpdateDebugInfo.html either way (for both source locations and this new debug intrinsic checking).

Bringing the undef discussion out from the inline comments to reduce clutter:

@Orlando said:
While making a debug intrinsic undef can sometimes be suboptimal (i.e. where a salvage or value replacement is possible), I don't think it ever reduces the correctness of the debug info.

@djtodoro said:
Hmmm... it does not affect the correctness, but it does affect completeness (e.g. in optimized code, variable should be covered for the places it is alive, but in many cases it is impossible). How are we going to find cases where we have missed to call salvageDebugInfo()?
In addition, we can add some checking levels -- e.g. there could be level that doesn't consider "undef" locations, which will reduce the number of false positives, for sure...

The best thing to do here probably depends on the goals of the tool. Is it trying to find passes that introduce correctness bugs or find passes that reduce coverage unnecessarily? If the answer is "both" then I agree that a checking-level option is probably a good way forward to help reduce the noise when looking for correctness bugs.

On the whole this SGTM but I haven't contributed to or recently used debugify so I would like see if anyone else has any comments on the approach in general before reviewing the code changes.

I am seeing some false positives for sure (most of these artifital vars locs are being removed after some "constant propagation" passes). We definitely need to investigate how to remove these false positives here (for both, instr and var locations).

I guess that can all fall under "follow up work" too? I think it is worth mentioning the caveat that there are known false positive cases in https://llvm.org/docs/HowToUpdateDebugInfo.html either way (for both source locations and this new debug intrinsic checking).

Good suggestion, thanks!

Bringing the undef discussion out from the inline comments to reduce clutter:

@Orlando said:
While making a debug intrinsic undef can sometimes be suboptimal (i.e. where a salvage or value replacement is possible), I don't think it ever reduces the correctness of the debug info.

@djtodoro said:
Hmmm... it does not affect the correctness, but it does affect completeness (e.g. in optimized code, variable should be covered for the places it is alive, but in many cases it is impossible). How are we going to find cases where we have missed to call salvageDebugInfo()?
In addition, we can add some checking levels -- e.g. there could be level that doesn't consider "undef" locations, which will reduce the number of false positives, for sure...

The best thing to do here probably depends on the goals of the tool. Is it trying to find passes that introduce correctness bugs or find passes that reduce coverage unnecessarily? If the answer is "both" then I agree that a checking-level option is probably a good way forward to help reduce the noise when looking for correctness bugs.

On the whole this SGTM but I haven't contributed to or recently used debugify so I would like see if anyone else has any comments on the approach in general before reviewing the code changes.

djtodoro updated this revision to Diff 344768.May 12 2021, 4:54 AM
  • Add note for in docs for known false positives
  • Refactor the code a bit
  • Rebase

This looks mostly good to me right now - I haven't gone through the HTML template code in-depth, but it looks to do what it says on the tin at least. I agree with the prior discussion that we will probably see a lot of false positives emerge from this, but I think the most important thing is that we have the tool. Once we have a larger spread of results, we can examine them and determine how best to tackle false positives so that we can get useful output.

llvm/lib/Transforms/Utils/Debugify.cpp
331

As of the current main, there is an isUndef function in DbgVariableIntrinsic that can be used for this check - though I've just noticed that it doesn't account for null operands, which is also something that should be handled in principle (since they aren't explicitly disallowed, and may be produced in a few places).

566

DVI->isUndef() can be used instead, as in above comment.

djtodoro added inline comments.May 17 2021, 8:54 AM
llvm/lib/Transforms/Utils/Debugify.cpp
331

OK, thanks -- I think the null op should be handled as well.

djtodoro updated this revision to Diff 345904.May 17 2021, 8:56 AM
  • address comments

This is ready to go now, right?

StephenTozer accepted this revision.May 18 2021, 3:09 AM

LGTM - since this adds new output to debugify I would prefer if there was a review from someone with more authority regarding that sort of thing, but I think this should be harmless: this should only affect original DI preservation checks, since debugify does not add variables and should not be used on modules with existing variables, and I don't believe we depend on debugify's original DI preservation in any tests.

This revision is now accepted and ready to land.May 18 2021, 3:09 AM

LGTM - since this adds new output to debugify I would prefer if there was a review from someone with more authority regarding that sort of thing, but I think this should be harmless: this should only affect original DI preservation checks, since debugify does not add variables and should not be used on modules with existing variables, and I don't believe we depend on debugify's original DI preservation in any tests.

Actually this needs to be corrected, debugify does add variables, but I believe that this change isn't going to affect check-debugify, is that correct?

LGTM - since this adds new output to debugify I would prefer if there was a review from someone with more authority regarding that sort of thing, but I think this should be harmless: this should only affect original DI preservation checks, since debugify does not add variables and should not be used on modules with existing variables, and I don't believe we depend on debugify's original DI preservation in any tests.

Actually this needs to be corrected, debugify does add variables, but I believe that this change isn't going to affect check-debugify, is that correct?

This affects DebugifyMode::OriginalDIMode only, that checks preservation original debug info (-g generated) in optimizations -- https://llvm.org/docs/HowToUpdateDebugInfo.html#test-original-debug-info-preservation-in-optimizations.

@StephenTozer Thanks for the review!

Minor inline nits that can be skipped, plus a question about relying on DenseMap ordering.

Looks good, although I've no familiarity with the python.

llvm/lib/Transforms/Utils/Debugify.cpp
305–306

I know this isn't code changed by this patch; but can't we early-exit if there's no subprogram here? As far as I understand it, no subprogram means that there are no debugging intrinsics or !dbg attachments to be found in the function (and anything otherwise is a verifier error).

465

Given that DIFunctionsBefore is a DenseMap, and the order of iteration here is being preserved by the Bugs array, won't this be vulnerable to DenseMaps non-deterministic iteration order? (Easily fixed by making everything MapVectors).

542

The additions to this function seem very similar to the additions to collectDebugInfoMetadata, would it not benefit from a refactor / shared utilities? I don't know / feel enough about Debugify to know if there's some context I've missed.

llvm/unittests/Transforms/Utils/DebugifyTest.cpp
66–68

Mega nit, this is redundant given the loop below simply won't iterate over Dbgs if it's empty, no? (This might be a style thing).

djtodoro updated this revision to Diff 346659.May 20 2021, 1:54 AM
  • address a comment
djtodoro added inline comments.May 20 2021, 2:27 AM
llvm/lib/Transforms/Utils/Debugify.cpp
305–306

There could be a case where we have forgotten to do F->setSubprogram(SP) when doing some optimizations on F (more precisely, newF->setSubprogram(SP) -- but newF is representing the same function F). I've found a case like that when doing full LLVM-projects build.

465

We have used the DenseMap for each DI Metadata checking, since we thought the order of reported bugs doesn't matter; do you think we should care about it?

542

I guess there are existing lines as well that are very similar in both collectDebugInfoMetadata() and checkDebugInfoMetadata(). Can that refactoring be done as an incremental NFC patch?

llvm/unittests/Transforms/Utils/DebugifyTest.cpp
66–68

It makes sense :)

jmorse added inline comments.May 20 2021, 4:20 AM
llvm/lib/Transforms/Utils/Debugify.cpp
465

IMO yes: any kind of unexpected variation in the output is going to be fatal to anyone trying to script around this tooling. We should be conservative in what's outputted.

542

Works for me.

djtodoro added inline comments.
llvm/lib/Transforms/Utils/Debugify.cpp
465

OK, sure. This https://reviews.llvm.org/D102841 addresses that.

jmorse accepted this revision.May 20 2021, 5:16 AM

LGTM with D102841

This revision was landed with ongoing or failed builds.May 20 2021, 6:43 AM
This revision was automatically updated to reflect the committed changes.