This is an archive of the discontinued LLVM Phabricator instance.

[RFC][Debugify] Remove false positve debug location losses
Needs ReviewPublic

Authored by ntesic on Mar 29 2023, 3:33 AM.

Details

Summary

The problem of "false positive results" in Debugify reports was discussed at one of the LLVM Dev meetings. This RFC aims at solving this problem for reports about lost Debug Locations (instruction source locations).

Discourse discussion: https://discourse.llvm.org/t/debugify-false-positives-elimination/69614

There are several scenarios where debug locations are dropped, but should not be reported by Debugify as losses:

  • instruction's DebugLoc should be dropped if it cannot be unambiguously determined (according to How-to-update-debug-info document)
  • instruction's DebugLoc is copied from an other instruction (loss should not be reported for both instructions)

In my opinion, those cases should not be treated as losses, since DebugLocs are explicitly updated, but couldn't be reconstructed.

My proposal, in this patch, is to add a flag to the Instruction (droppedLocation), which should be set if the DebugLoc is explicitly set to empty for that instruction. Then, during Debugify analysis, we should not report DebugLoc loss if instruction's DebugLoc is explicitly updated (by checking the flag).

Proposed solution was tested by compiling clang-14 with -fverify-debuginfo-preserve. The results show around 65% of reported DebugLoc losses is eliminated.

Diff Detail

Event Timeline

ntesic created this revision.Mar 29 2023, 3:33 AM
Herald added a project: Restricted Project. · View Herald TranscriptMar 29 2023, 3:33 AM
ntesic requested review of this revision.Mar 29 2023, 3:33 AM
Herald added a project: Restricted Project. · View Herald TranscriptMar 29 2023, 3:33 AM
ntesic edited the summary of this revision. (Show Details)Mar 29 2023, 3:36 AM
ntesic added reviewers: djtodoro, Orlando, StephenTozer.

This looks useful and the results in the RFC are promising. I like the approach, however, I worry that adding a field to Instruction may be expensive - have you measured the performance and memory usage impact?

It may be that we need to come up with another mechanism for storing this info (nothing concrete comes to mind right now - I'm not terribly familiar with Instruction's internal state; I'm not sure if there are any spare bits hanging around that we might be able to use instead e.g. in SubclassData or something like that).

It seems like a good idea though. Have you managed to find any bugs (or perhaps parts of the compiler that should be marking DebugLocs as dropped) using this yet?

myhsu added a subscriber: myhsu.Mar 29 2023, 9:58 AM

I also feel like we should be a little more discreet on adding new Instruction fields. Assuming a debug location is set to (0,0) if and only if we intentionally drop a debug location, then isLocationDropped can simply be implemented as checking if debugloc is equal to (0,0) rather than adding a new Instruction field. Unfortunately, correct me if I'm wrong, I don't think that's the case. Namely, we need a separate state to represent removing debugloc completely without setting it to (0,0). A potential solution might be pointing debugloc to an unique, special DILocation representing such state when that happens, such that intentionally dropping debugloc is embodied by either (0,0) or that special DILocation. No additional instruction field is needed.

jmorse added a subscriber: jmorse.Apr 4 2023, 8:10 AM
ntesic added a comment.Apr 6 2023, 5:59 AM

Hi, thanks for your comments!

@Orlando
I haven't done any advanced performance measurements, but for the simple build time & size checking, I didn't detect any increase in time nor size.
Regarding bugs detection, in this patch I was focused on removing results from debugify reports that can lead to a dead-end investigation of debug-location drops.
According to how-to-update-debug-info document , the dropLocation() should be used whereever we want to explicitly remove debug-location. However, in most places
we still use setDebugLoc() with nullptr or empty DebugLoc parameter. It would take a lot of time to update all of these to use dropLocation().
Using spare bits in some of the existing bit vectors is definitely a better idea, I will investigate that further.

@myhsu
I think we are on the same page. There are situations where empty debug-loc is not intentionally dropped, so we need to detect the intentional drop in some way.
The (0,0) debug-loc is already a special kind of DILocation, representing an unknown source location, but doesn't imply that the location is implicitly or explicitly dropped.
Thanks for the idea.

Based on that, I am thinking about using (0,1) debug location if it is intentionally dropped, but I am not sure if some other parts of the debug-info infrastructure expect to see exactly (0,0) debug-loc.
What are the reviewers’ opinions about this idea?