
[llvm-dwarfdump][Statistics] Fix calculation of OffsetToFirstDefinition
Needs Review · Public

Authored by krisb on Oct 16 2019, 4:05 AM.

Details

Summary

While calculating a variable's scope, OffsetToFirstDefinition is supposed
to be applied if the first definition of the variable doesn't match
the beginning of the scope (this may happen, for example, if a variable
is defined in the middle of a function).

But this also causes inaccuracies in debug info statistics, especially for
optimized code.
This patch attempts to reduce these inaccuracies by:

  • applying the offset to local variables' scopes only, since the scope of a function argument is the whole function,
  • applying the offset before calculating a variable's coverage bucket; otherwise, the statistics look inconsistent,
  • taking into account, while calculating the offset, that the scope may be represented as a set of disconnected ranges.

Currently the offset is calculated as the distance between the address where the scope starts
(scope's LowPC) and the address where the variable is defined (variable's LowPC).
Consider these two cases:

Scope
 ranges
   [0x04,0x08),
   [0x0a,0x0e)
 Variable
   location
     [0x0a,0x0b)

Scope = (0x08 - 0x04) + (0x0e - 0x0a) = 8
OffsetToFirstDef = 0x0a - 0x04 = 6
If we adjust the scope by the offset, it will be 2 bytes instead of 4
as it should be.

Scope
  ranges
    [0x00,0x02),
    [0x0a,0x0e)
  Variable
    location
      [0x0a,0x0b)

Scope = (0x02 - 0x00) + (0x0e - 0x0a) = 6
OffsetToFirstDef = 0x0a - 0x00 = 10
The offset is greater than the scope itself, so this will be counted as an error
and the offset will be set to 0.
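A range-aware version of the offset calculation could count only the scope bytes that actually precede the variable's first definition, skipping the gaps between disconnected chunks. This is a sketch with illustrative names, not the patch's actual code:

```cpp
#include <cstdint>
#include <utility>
#include <vector>

// Half-open address ranges [LowPC, HighPC).
using Ranges = std::vector<std::pair<uint64_t, uint64_t>>;

// Count only the scope bytes that lie before VarLowPC; bytes in the
// gaps between disconnected range chunks are not counted.
uint64_t offsetToFirstDef(const Ranges &Scope, uint64_t VarLowPC) {
  uint64_t Offset = 0;
  for (auto [Low, High] : Scope) {
    if (VarLowPC >= High)
      Offset += High - Low;     // chunk lies entirely before the first def
    else if (VarLowPC > Low)
      Offset += VarLowPC - Low; // first def falls inside this chunk
  }
  return Offset;
}
```

With this version, both examples above give an offset of 4 for the first case and 2 for the second, so the adjusted scope comes out to the expected 4 bytes in each case.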

Diff Detail

Event Timeline

krisb created this revision.Oct 16 2019, 4:05 AM
Herald added a project: Restricted Project.Oct 16 2019, 4:05 AM
krisb edited the summary of this revision. (Show Details)Oct 16 2019, 4:07 AM
krisb added a project: debug-info.

Thanks for working on this.

Before digging into the change: I see this will require a change in test/tools/llvm-locstats/locstats.ll as well.

I'm confused - given the presence of noncontiguous ranges, why would the "first def" be an important property?

The existence of this operation is "iffy" to me - I'd expect a general solution that handles all the noncontiguous range chunks equally, so there wouldn't be a special case for the "first def".

krisb added a comment.Oct 16 2019, 4:22 PM

I'm confused - given the presence of noncontiguous ranges, why would the "first def" be an important property?

The existence of this operation is "iffy" to me - I'd expect a general solution that handles all the noncontiguous range chunks equally, so there wouldn't be a special case for the "first def".

Currently, we calculate the variable's coverage this way (if I understand the things correctly):

  1. Calculate the size of the whole scope - subprogram, lexical block, etc (either as a distance between scope's HighPC or LowPC or a sum of sizes of scope's ranges).
  2. Calculate the number of bytes where the variable is defined (by its location list).
  3. Calculate OffsetToFirstDefinition and subtract it from the size of the scope.
  4. Divide the number of bytes where a variable is defined by the adjusted size of the scope.

OffsetToFirstDefinition is applied when the variable's first-definition address is greater than the scope's LowPC. I guess this is done to handle cases where the variable is defined not at the beginning of the scope (e.g. in the middle of a function).
Actually, I don't like this solution with OffsetToFirstDefinition because it makes the statistics more optimistic than they often are and hides some issues. But I'm not sure the alternatives are really better.
So, in this patch, I try to fix some issues related to the calculation and usage of OffsetToFirstDefinition. The major one is that OffsetToFirstDefinition is calculated as the distance between the variable's first-definition address and the scope's LowPC, which is obviously wrong in cases like this

DW_TAG_lexical_block
  DW_AT_ranges  (0x00000000
     [0x0000000000000016, 0x0000000000000020)
     [0x0000000000000183, 0x00000000000002f4))

  DW_TAG_variable
    DW_AT_location  (0x0000018e
       [0x0000000000000183,  0x0000000000000197): DW_OP_reg0 RAX)

where we'd say the whole scope of the variable is covered.
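The four steps above can be condensed into a small sketch (the struct and helper names are illustrative, not the actual Statistics.cpp identifiers):

```cpp
#include <algorithm>
#include <cstdint>

// Inputs for one variable, per the steps described above.
struct VarCoverageInputs {
  uint64_t ScopeBytes;       // step 1: total size of the enclosing scope
  uint64_t BytesCovered;     // step 2: bytes covered by the location list
  uint64_t OffsetToFirstDef; // step 3: scope bytes before the first def
};

double coverage(const VarCoverageInputs &In) {
  // Step 3: shrink the scope so it starts at the first definition; if the
  // offset exceeds the scope (the error case above), ignore it.
  uint64_t AdjustedScope = In.ScopeBytes > In.OffsetToFirstDef
                               ? In.ScopeBytes - In.OffsetToFirstDef
                               : In.ScopeBytes;
  // Step 4: covered bytes divided by the adjusted scope size.
  return AdjustedScope ? double(std::min(In.BytesCovered, AdjustedScope)) /
                             double(AdjustedScope)
                       : 0.0;
}
```

This also makes the optimism visible: the larger OffsetToFirstDef gets, the smaller the denominator, and the closer the reported coverage drifts toward 100%.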

I'm confused - given the presence of noncontiguous ranges, why would the "first def" be an important property?

The existence of this operation is "iffy" to me - I'd expect a general solution that handles all the noncontiguous range chunks equally, so there wouldn't be a special case for the "first def".

Currently, we calculate the variable's coverage this way (if I understand the things correctly):

  1. Calculate the size of the whole scope - subprogram, lexical block, etc (either as a distance between scope's HighPC or LowPC or a sum of sizes of scope's ranges).
  2. Calculate the number of bytes where the variable is defined (by its location list).
  3. Calculate OffsetToFirstDefinition and subtract it from the size of the scope.
  4. Divide the number of bytes where a variable is defined by the adjusted size of the scope.

OffsetToFirstDefinition is applied when the variable's first-definition address is greater than the scope's LowPC. I guess this is done to handle cases where the variable is defined not at the beginning of the scope (e.g. in the middle of a function).

Ah, OK - thanks for the explanation! Yeah, makes sense - and your point about it potentially being optimistic is valid, and yeah, I don't have any great ideas (no way to know where the variable's value /should/ start).

I'll leave you & other folks who have more context with locstats to continue with this review - sorry for the diversion & thanks for walking me through it!


avl added a subscriber: avl.Oct 18 2019, 2:57 AM

I think OffsetToFirstDefinition should not be taken into account at all,
i.e., coverage should be calculated against the block scope, not against the variable scope (which is unknown).

If we have the following case:

DW_TAG_lexical_block

DW_AT_ranges  (0x00000000
   [0x000000000000010, 0x0000000000000197))

DW_TAG_variable
  DW_AT_location  (0x0000018e
     [0x0000000000000183,  0x0000000000000197): DW_OP_reg0 RAX)

And then we create a fix which improves debug info in such a way:

DW_TAG_lexical_block

DW_AT_ranges  (0x00000000
   [0x000000000000010, 0x0000000000000197))

DW_TAG_variable
  DW_AT_location  (0x0000018e
     [0x0000000000000100,  0x0000000000000183): DW_OP_reg1 RBX)
     [0x0000000000000183,  0x0000000000000197): DW_OP_reg0 RAX)

In both cases the reported statistic would be 100%.
That hides the original low coverage for that variable and makes the improvement invisible.

krisb added a comment.Mon, Oct 21, 3:56 AM

@avl, thanks for the comment.

In D69027#1714203, @avl wrote:

I think OffsetToFirstDefinition should not be taken into account at all.
i.e., coverage should be calculated against block scope, not against variable scope(which is unknown).

Although we are not able to precisely identify where a variable starts, I still think the heuristic we use is valuable. It makes good sense to assume that in non-optimized code the 'first definition' in debug info comes close to the start of the variable's scope.
Furthermore, if we give up guessing about the variable's scope, we should also stop calculating 'coverage'. I mean, the number of bytes covered with debug info divided by the size of the scope doesn't have a clear 'physical' meaning then, IMO.

In D69027#1714203, @avl wrote:

That hides original low coverage for that variable and makes invisible performed improvement.

This will be shown in "vars scope bytes total" and "vars scope bytes covered" statistics.

avl added a comment.Tue, Oct 22, 4:35 PM

I am not the author of statistics in llvm-dwarfdump tool, and the following information could be irrelevant.
It needs to be checked by someone from the DebugInfo project.

It seems calculating scope coverage is currently incomplete.
The original idea is based on the fact that the variable's enclosing scope can be determined by subtracting OffsetToFirstDefinition:

/// Total number of PC range bytes covered by DW_AT_locations.
unsigned ScopeBytesCovered = 0;
/// Total number of PC range bytes in each variable's enclosing scope,
/// starting from the first definition of the variable.
unsigned ScopeBytesFromFirstDefinition = 0;

But if the generated debug info is not complete, then the start and end of the address ranges may not match the real beginning and end of the variable's address range. In that case, coverage calculated using OffsetToFirstDefinition would always be 100%, as in the example above. It would be hard to guess that the debug info is incomplete by looking at 100% coverage. Instead, comparing with the scope range would show us that the coverage is small (in case incomplete debug info is generated). That could be used to discover the incomplete-debug-info case.

Speaking of the variable's enclosing scope: it is necessary to take into account not only OffsetToFirstDefinition, which points to the start of the variable's enclosing scope. The end of the variable's address ranges should also be encountered.

Currently the llvm-dwarfdump tool calculates the scope's and variable's statistics separately:

unsigned VarScopeBytesCovered = 0;
/// Total number of PC range bytes in each variable's enclosing scope,
/// starting from the first definition of the variable (only for local
/// variables).
unsigned VarScopeBytesFromFirstDefinition = 0;
/// Total number of PC range bytes covered by DW_AT_locations with
/// the debug entry values (DW_OP_entry_value) (only for local variables).

So it is probably OK to have coverage for a variable against the overall scope (which lets us see the above problem) and to have coverage for a variable against its enclosing scope, assuming the start and end of the variable's address ranges are correct.

In that case it looks like the coverage statistic could be calculated like this:

for DIE

from start_scope_pc till end_scope_pc 
    ScopeBytes += scope_pc_range
    ScopeBytesCovered += scope_pc_range INTERSECTION OF variable_pc_range

ScopeCoverage = ScopeBytesCovered / ScopeBytes;

from start_var_pc till end_var_pc 
    VarBytes += var_pc_range 
    VarBytesCovered += var_pc_range INTERSECTION OF variable_pc_range

VarCoverage = VarBytesCovered / VarBytes;
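The intersection step in the pseudocode above could be sketched in C++ like this (illustrative helpers, not a proposed implementation):

```cpp
#include <algorithm>
#include <cstdint>
#include <utility>
#include <vector>

// Half-open address ranges [LowPC, HighPC).
using Ranges = std::vector<std::pair<uint64_t, uint64_t>>;

// Number of bytes in the pairwise intersection of two sets of ranges
// (assumes each set's chunks don't overlap each other).
uint64_t intersectionBytes(const Ranges &A, const Ranges &B) {
  uint64_t Bytes = 0;
  for (auto [ALow, AHigh] : A)
    for (auto [BLow, BHigh] : B) {
      uint64_t Low = std::max(ALow, BLow);
      uint64_t High = std::min(AHigh, BHigh);
      if (Low < High)
        Bytes += High - Low;
    }
  return Bytes;
}

uint64_t totalBytes(const Ranges &R) {
  uint64_t Bytes = 0;
  for (auto [Low, High] : R)
    Bytes += High - Low;
  return Bytes;
}

// Coverage of a variable's location ranges against its scope ranges.
double scopeCoverage(const Ranges &Scope, const Ranges &VarLoc) {
  uint64_t ScopeBytes = totalBytes(Scope);
  return ScopeBytes ? double(intersectionBytes(Scope, VarLoc)) / ScopeBytes
                    : 0.0;
}
```

For the lexical-block example earlier in the thread (scope [0x16, 0x20) + [0x183, 0x2f4), location [0x183, 0x197)), this yields 20 covered bytes out of 379 scope bytes, i.e. about 5% rather than the 100% the offset-adjusted calculation would report.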
tools/llvm-dwarfdump/Statistics.cpp
300–301

adjustScopeToFirstDef() returns BytesInScope with OffsetToFirstDefinition already subtracted. That BytesInScope value is later passed into collectLocStats(). Previously it was passed without subtracting OffsetToFirstDefinition. Is that correct?

aprantl added inline comments.Wed, Oct 23, 12:11 PM
tools/llvm-dwarfdump/Statistics.cpp
149

Can you add a /// comment that explains what this function is doing?

535

Since this is going to change the results in an incomparable way, can you please also bump the version number?

krisb updated this revision to Diff 226567.Sun, Oct 27, 4:03 AM
krisb marked 4 inline comments as done.

Bumped version to 4, updated more tests, added a comment for adjustScopeToFirstDef().

krisb added a comment.Sun, Oct 27, 4:04 AM
In D69027#1718400, @avl wrote:

Speaking of variable's enclosing scope: it is necessary to take into account not only OffsetFromFirstDefinition, which points into the start of the variable's enclosing scope. The end of the variable's address ranges should also be encountered.

I have my patch based on two assumptions:

  1. We only care about lexically (or statically) scoped languages. Otherwise detecting a variable's scope is very problematic.
  2. In a lexically scoped language the end of a variable's scope should match the end of the enclosing lexical scope.

That's why I'm a bit confused about why we would need to make some adjustments to the end of a range. Could you kindly clarify what you meant?

In D69027#1718400, @avl wrote:

So it is probably OK to have coverage for variable against overall scope(which allows us to see above problem) and to see coverage for variable against it`s enclosing scope, assuming start and end of address ranges of the variable are correct.

If the other metric is valuable, I suggest implementing it in a separate patch. But could you give an example when calculating coverage against full scope provides additional knowledge which does not follow from currently existing metrics?

tools/llvm-dwarfdump/Statistics.cpp
300–301

From my point of view, yes, that's correct.
I think we should either apply OffsetToFirstDefinition every time we calculate any 'coverage' metric or not apply it at all. Another option is to explicitly say which metrics are calculated against the adjusted scope and which against the full one. Currently, we have a mix, which makes interpreting the statistics harder.
In this patch, I chose the first option and applied OffsetToFirstDefinition everywhere.

535

Forgot about this. Done, thanks!

avl added a comment.Mon, Oct 28, 8:03 AM

I have my patch based on two assumptions:

We only care about lexically (or statically) scoped languages. Otherwise detecting a variable's scope is very problematic.
In a lexically scoped language the end of a variable's scope should match the end of the enclosing lexical scope.

That's why I'm a bit confused about why we would need to make some adjustments to the end of a range. Could you kindly clarify what you meant?

The current solution of showing coverage for VarScope - OffsetToFirstDefinition looks like a mix of two concepts:

  1. variable scope: the block where a variable is defined.
  2. variable lifetime: the region from definition till last use.

I think explicitly separating these two things would make the statistics easier to understand. See the following example of the current behavior.

If the other metric is valuable, I suggest implementing it in a separate patch. But could you give an example when calculating coverage against full scope provides additional knowledge which does not follow from currently existing metrics?

suppose we have such case with incomplete debug info:

DW_TAG_lexical_block

DW_AT_ranges  (0x00000000
   [0x000000000000010, 0x0000000000000197))

DW_TAG_variable
  DW_AT_location  (0x0000018e
     [0x0000000000000183,  0x0000000000000197): DW_OP_reg0 RAX)

And then we create a fix which improves debug info in such a way:

DW_TAG_lexical_block

DW_AT_ranges  (0x00000000
   [0x000000000000010, 0x0000000000000197))

DW_TAG_variable
  DW_AT_location  (0x0000018e
     [0x0000000000000100,  0x0000000000000183): DW_OP_reg1 RBX)
     [0x0000000000000183,  0x0000000000000197): DW_OP_reg0 RAX)

In both cases the reported coverage would be 100%.
That hides the original low coverage for that variable and makes the improvement invisible.
The change in coverage looks like 100%->100%. If it instead were 20%->30%, the result of the changes would be visible.
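For the concrete ranges in this example, the full-scope coverage can be worked out directly (added arithmetic; the exact percentages come out as roughly 5% -> 39%):

```cpp
#include <cstdint>

// Scope [0x10, 0x197): 0x187 = 391 bytes total.
constexpr uint64_t ScopeBytes = 0x197 - 0x10;
// Before the fix: location [0x183, 0x197) covers 0x14 = 20 bytes.
constexpr uint64_t CoveredBefore = 0x197 - 0x183;
// After the fix: [0x100, 0x183) + [0x183, 0x197) covers 0x97 = 151 bytes.
constexpr uint64_t CoveredAfter = (0x183 - 0x100) + (0x197 - 0x183);
// Against the full scope: 20/391 ~= 5% before, 151/391 ~= 39% after.
// Against the scope trimmed by OffsetToFirstDefinition: 100% both times.
```

So measuring against the full scope makes the improvement visible, while the trimmed denominator collapses both cases to 100%.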

In D69027#1723448, @avl wrote:

I have my patch based on two assumptions:

We only care about lexically (or statically) scoped languages. Otherwise detecting a variable's scope is very problematic.
In a lexically scoped language the end of a variable's scope should match the end of the enclosing lexical scope.

That's why I'm a bit confused about why we would need to make some adjustments to the end of a range. Could you kindly clarify what you meant?

The current solution of showing coverage for VarScope - OffsetToFirstDefinition looks like a mix of two concepts:

  1. variable scope: the block where a variable is defined.
  2. variable lifetime: the region from definition till last use.

    I think explicitly separating these two things would make the statistics easier to understand. See the following example of the current behavior.

I'm not sure this answers the question which was about your statement: "The end of the variable's address ranges should also be encountered."

& I'm not sure those are two ideal concepts.

In C++ & other scope-based languages, the scope of a variable is from the point of declaration to the end of the enclosing scope. In the DWARF the first point there is lost - so we make a heuristic guess at it.

The region from definition till last use would also be a guess - the DWARF doesn't have information about the last use. Only about the last live location (which may be past the last use).

We know that the variable should be live until the end of the enclosing scope (as per the language) but we don't know where it should start - so there are (if I'm understanding this discussion correctly) two statistics - one for each end of the heuristic range: one measuring from the start of the enclosing scope (the earliest the variable could be declared) and the other measuring from the first reported location (that's the statistic that's being discussed/fixed here).

I don't think pruning the end of the latter list improves the quality of the statistic - since scope based languages, as @krisb said, have the variable accessible (at the language level) to the end of the enclosing scope, so we know that coverage bytes missing in that tail are missed opportunities for greater variable coverage.

If the other metric is valuable, I suggest implementing it in a separate patch. But could you give an example when calculating coverage against full scope provides additional knowledge which does not follow from currently existing metrics?

suppose we have such case with incomplete debug info:

DW_TAG_lexical_block

DW_AT_ranges  (0x00000000
   [0x000000000000010, 0x0000000000000197))

DW_TAG_variable
  DW_AT_location  (0x0000018e
     [0x0000000000000183,  0x0000000000000197): DW_OP_reg0 RAX)

And then we create a fix which improves debug info in such a way:

DW_TAG_lexical_block

DW_AT_ranges  (0x00000000
   [0x000000000000010, 0x0000000000000197))

DW_TAG_variable
  DW_AT_location  (0x0000018e
     [0x0000000000000100,  0x0000000000000183): DW_OP_reg1 RBX)
     [0x0000000000000183,  0x0000000000000197): DW_OP_reg0 RAX)

In both cases the reported coverage would be 100%.
That hides the original low coverage for that variable and makes the improvement invisible.
The change in coverage looks like 100%->100%. If it instead were 20%->30%, the result of the changes would be visible.

That's why there's the other statistic, which is bytes of enclosing scope coverage rather than the version that trims the start. ("vars scope bytes covered" / "vars scope bytes total")

In D69027#1723448, @avl wrote:

If the other metric is valuable, I suggest implementing it in a separate patch. But could you give an example when calculating coverage against full scope provides additional knowledge which does not follow from currently existing metrics?

suppose we have such case with incomplete debug info:

DW_TAG_lexical_block

DW_AT_ranges  (0x00000000
   [0x000000000000010, 0x0000000000000197))

DW_TAG_variable
  DW_AT_location  (0x0000018e
     [0x0000000000000183,  0x0000000000000197): DW_OP_reg0 RAX)

And then we create a fix which improves debug info in such a way:

DW_TAG_lexical_block

DW_AT_ranges  (0x00000000
   [0x000000000000010, 0x0000000000000197))

DW_TAG_variable
  DW_AT_location  (0x0000018e
     [0x0000000000000100,  0x0000000000000183): DW_OP_reg1 RBX)
     [0x0000000000000183,  0x0000000000000197): DW_OP_reg0 RAX)

In both cases the reported coverage would be 100%.
That hides the original low coverage for that variable and makes the improvement invisible.
The change in coverage looks like 100%->100%. If it instead were 20%->30%, the result of the changes would be visible.

Doesn't the absolute number of covered bytes (i.e. ScopeBytesCovered) suit this purpose? Yes, we will see no change in coverage, but the absolute number will change from 0x14 to 0x97 bytes, which may be considered an improvement. Does that make sense to you?

We know that the variable should be live until the end of the enclosing scope (as per the language) but we don't know where it should start - so there are (if I'm understanding this discussion correctly) two statistics - one for each end of the heuristic range: one measuring from the start of the enclosing scope (the earliest the variable could be declared) and the other measuring from the first reported location (that's the statistic that's being discussed/fixed here).

In master, we indeed have two statistic types: one calculates the number of bytes covered by a variable's DW_AT_location (i.e. ScopeBytesCovered) against the number of bytes of the variable's scope from the first definition (i.e. ScopeBytesFromFirstDefinition); the other calculates a 'coverage bucket' per variable by dividing the number of the variable's covered bytes by the number of bytes in the full scope, without adjustment to the first definition (see collectLocStats()).
But in this patch I'm changing the way we calculate the 'coverage bucket' to divide by the number of bytes in the scope adjusted to the first definition, for consistency. It's not clear to me why the 'coverage bucket' was measured the other way, and I'm not sure it was intentional.

Just to summarize the discussion above. If I understand correctly (Alexey @avl, please, correct me if I'm wrong here), Alexey suggests either

  • to replace all the statistics that are calculated against the scope adjusted to the first definition with the statistics against full scope.

or

  • to add new statistics that are calculated against full scope to provide additional opportunities to detect/track changes in debug info coverage.

The first option makes this patch meaningless. But, IMO, with this approach the statistics would no longer answer questions about the quality of the debug info. This is why I do not agree with it.
Regarding the second option, I do not have a strong opinion, but I think it's better considered outside the scope of this patch.

avl added a comment.Tue, Oct 29, 4:30 PM

I'm not sure this answers the question which was about your statement: "The end of the variable's address ranges should also be encountered."

& I'm not sure those are two ideal concepts.

In C++ & other scope-based languages, the scope of a variable is from the point of declaration to the end of the enclosing scope. In the DWARF the first point there is lost - so we make a heuristic guess at it.

The region from definition till last use would also be a guess - the DWARF doesn't have information about the last use. Only about the last live location (which may be past the last use).

right.

We know that the variable should be live until the end of the enclosing scope (as per the language) but we don't know where it should start - so there are (if I'm understanding this discussion correctly) two statistics - one for each end of the heuristic range: one measuring from the start of the enclosing scope (the earliest the variable could be declared) and the other measuring from the first reported location (that's the statistic that's being discussed/fixed here).

At the present moment the first one (the earliest the variable could be declared) is not calculated (it was calculated partially in the original code). My main point is that we need it.

I don't think pruning the end of the latter list improves the quality of the statistic - since scope based languages, as @krisb said, have the variable accessible (at the language level) to the end of the enclosing scope, so we know that coverage bytes missing in that tail are missed opportunities for greater variable coverage.

There are three views on coverage discussed here :

  1. Range from the start of the scope (the earliest position where the variable could be declared) till the end of the scope.

     This shows how debug info covers the scope from the source code where the variable is declared. I think we need this view since it allows tracking all coverage changes and is based on known scope boundaries.

  2. Range from the variable's declaration point (first reported location) till the end of the scope.

     This shows how debug info covers the variable's enclosing scope from the source code. It could miss some debug info changes since we use a heuristic to decide where the variable declaration is located.

  3. Range for the variable's lifetime - from the first reported location till the last reported location.

     This is assumed to show how debug info covers the resulting code, i.e., whether all resulting binary instructions are covered by debug info. This view could be useful, especially for optimized code. Let's see the following example.
int foo ( int p1 ) {
    int ll = p1 + 5;

    printf("\n ll %d", ll );

    .. other code
}

Variable "ll" is declared at the very start of the scope. It is visible until the end of the scope. But when compiled with optimization, the variable would be allocated to a register. As a result, it would be visible during a smaller range. Debug info covers all of that range:

0x0000002a:   DW_TAG_subprogram
                DW_AT_low_pc    (0x0000000000000000)
                DW_AT_high_pc   (0x0000000000000027)

0x00000056:     DW_TAG_variable
                  DW_AT_location        (0x00000036
                     [0x0000000000000006,  0x0000000000000012): DW_OP_reg4 RSI)

0x00000074:     NULL

If we calculated coverage using #2, the coverage would be 36%.
But all instructions which relate to the variable "ll" are covered by debug info.
So, in this case, debug info covers 100% of the existing code.
Thus #3 would show the real coverage in this case. That's the idea.
But the problem here is that the start and end of the variable range reported in DWARF
may not match the real instructions. So #3 is also a heuristic.
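The arithmetic for this example under the three views works out as follows (a quick worked check of the figures above):

```cpp
#include <cstdint>

// DW_AT_low_pc 0x0, DW_AT_high_pc 0x27: the subprogram spans 39 bytes.
constexpr uint64_t ScopeBytes = 0x27 - 0x0;
// The single location range [0x06, 0x12) covers 12 bytes.
constexpr uint64_t Covered = 0x12 - 0x06;
// View #1, full scope:            12 / 39           ~= 30%
// View #2, from first definition: 12 / (0x27 - 0x06) = 12 / 33 ~= 36%
// View #3, reported lifetime:     12 / (0x12 - 0x06) = 12 / 12  = 100%
```

So the same DWARF yields three quite different coverage figures depending on which denominator is chosen, which is the crux of this discussion.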

In short, I think we need #1, and probably #2 and #3.

That's why there's the other statistic, which is bytes of enclosing scope coverage rather than the version that trims the start. ("vars scope bytes covered" / "vars scope bytes total")

My understanding is that there is no statistic which does not trim the start:

BytesInScope -= OffsetToFirstDefinition;
// Turns out we have a lot of ranges that extend past the lexical scope.
GlobalStats.ScopeBytesCovered += std::min(BytesInScope, BytesCovered);
GlobalStats.ScopeBytesFromFirstDefinition += BytesInScope;
GlobalStats.VarScopeBytesCovered += std::min(BytesInScope, BytesCovered);
GlobalStats.VarScopeBytesFromFirstDefinition += BytesInScope;

collectLocStats() in the original version receives the untrimmed scope, but that looks like an error.

avl added a comment.Tue, Oct 29, 5:03 PM

Doesn't the absolute number of covered bytes (i.e. ScopeBytesCovered) suite this purpose? Yes, we will see no change in coverage, but the absolute number will be changed from 14 to 97 bytes, which may be considered as an improvement. Does it make sense to you?

I think the absolute number of covered bytes is not enough here. There should be an explicit change in the coverage coefficient.

(Alexey @avl, please, correct me if I'm wrong here), Alexey suggests either

my suggestion is that we need untrimmed coverage for

GlobalStats.ScopeBytesCovered += std::min(BytesInScope, BytesCovered);
GlobalStats.ScopeBytes += BytesInScope;
GlobalStats.ParamScopeBytesCovered += std::min(BytesInScope, BytesCovered);
GlobalStats.ParamScopeBytes += BytesInScope;
GlobalStats.VarScopeBytesCovered += std::min(BytesInScope, BytesCovered);
GlobalStats.VarScopeBytes += BytesInScope;

and for collectLocStats.

As to the #2 and #3 - they could probably be done later.

I was going to say that Alexey's diversion was derailing this review a bit - but as I've thought about it some more, I think there are some trickier problems with this idea:

The order of instructions and blocks is unspecified; we don't really know which is the "first" block in a scope - there's no reason the execution order of the blocks must match the address order, or the order they are emitted into the range list (which doesn't have to be in either order - so there are 3 possible orderings of blocks; and of course the "scope" ordering might not even be a unique ordering: it could start in the middle of a loop, with different bits reachable in different orders, etc.)

I guess even for a contiguous range of both the scope and the location, the basic blocks might be out of order (the earlier/lower instructions might represent later execution).
eg: you could have a scope of [0, 10) and a location that is [0, 5) + [7, 10) - but those 2 missing bytes (bytes 5 and 6) could be due to the start of the variable's lifetime (ie: they could be instructions related to code that appears before the variable is declared in the source).

And you could be in that situation where some basic block reordering means the "tail" of the scope, in machine instructions is actually code related to the start of the scope & prior to the variable's declaration.

So, yeah, I'm starting to feel like this "trim the start of the scope" bit is perhaps not a very good heuristic. You could trim /every/ range chunk separately - but you'd want to trim both the start and the end.

All this is to say, I'm coming around to the idea that it's difficult, if not impossible, to have a metric that will give you even a vaguely confident number about whether we've described a variable's value across its entire scope.

But I think this is more @aprantl 's wheelhouse, since he created these statistics - so, Adrian, it'd be super useful if you could chime in here on what you think is useful to you/what you want to see in these features.

I see the primary value of this statistic to allow us to see change — then a human has to make judgement what that change means. Regardless which choice we make, we may get a false signal from a compiler change that reorders basic blocks, but we'll get valuable signal when we're testing the ramifications of a debug-info-only change to the compiler. As long as we document what the statistics measure, I think we just need to pick something and run with it.

avl added a comment.Thu, Oct 31, 5:44 AM

I see the primary value of this statistic to allow us to see change — then a human has to make judgement what that change means.

That is exactly the reason why I propose to drop the calculation of OffsetToFirstDefinition. For the already discussed example, the currently visible change would be 100%->100%. Without OffsetToFirstDefinition it would be something like 20%->30%, which makes the change visible.

I see the primary value of this statistic to allow us to see change — then a human has to make judgement what that change means. Regardless which choice we make, we may get a false signal from a compiler change that reorders basic blocks, but we'll get valuable signal when we're testing the ramifications of a debug-info-only change to the compiler. As long as we document what the statistics measure, I think we just need to pick something and run with it.

Sure - then I'd suggest that a good statistic to choose would be "bytes covered by location lists" (which is already there) and maybe, if you want a % one, "bytes covered by location lists" / "bytes of enclosing scopes" (counting each scope once for each variable within it - or doing the loc-list-intersects-with-enclosing-scope earlier & just tracking the relative numbers from that point).

krisb added a comment.Thu, Oct 31, 3:51 PM

@dblaikie thanks, David! This definitely needs to be kept in mind.

In D69027#1728577, @avl wrote:

I see the primary value of this statistic as allowing us to see change — then a human has to make a judgement about what that change means.

That is exactly the reason why I propose to drop the calculation of OffsetToFirstDefinition. For the already discussed example, currently the visible change would be 100%->100%. Without OffsetToFirstDefinition, it would be something like 20%->30%, which makes the change visible.

Regarding tracking changes and their visibility, dropping the calculation of OffsetToFirstDefinition is not a panacea, especially if we are talking about 'coverage buckets'. In cases where we have a large enclosing scope and a relatively small variable scope (or coverage), we will see no changes because they will be smaller than the 10% needed to cross a bucket's threshold.
So, the only statistic we can rely on is the absolute number of bytes covered by the variable's DW_AT_locations, and maybe the number of bytes covered divided by the number of bytes in the enclosing scope. Doesn't this make even significant changes unnoticeable?

krisb added a comment.Thu, Oct 31, 4:05 PM

I see the primary value of this statistic as allowing us to see change — then a human has to make a judgement about what that change means. Regardless of which choice we make, we may get a false signal from a compiler change that reorders basic blocks, but we'll get a valuable signal when we're testing the ramifications of a debug-info-only change to the compiler. As long as we document what the statistics measure, I think we just need to pick something and run with it.

Sure - then I'd suggest that a good statistic to choose would be "bytes covered by location lists" (which is already there) and maybe, if you want a % one, "bytes covered by location lists" / "bytes of enclosing scopes" (counting each scope once for each variable within it - or doing the loc-list-intersects-with-enclosing-scope earlier & just tracking the relative numbers from that point).

Looks like I missed this message while writing my previous one, sorry.
I strongly agree here. As I said at the beginning of the discussion, if we give up guessing about the variable's scope, the only statistic that makes sense is "bytes covered by location lists".
But what about the others we already have? Should they be considered 'not useful' or 'not reliable'? I still think that measurements against some kind of 'adjusted' scope can be quite useful.

@dblaikie thanks, David! This definitely needs to be kept in mind.

In D69027#1728577, @avl wrote:

I see the primary value of this statistic as allowing us to see change — then a human has to make a judgement about what that change means.

That is exactly the reason why I propose to drop the calculation of OffsetToFirstDefinition. For the already discussed example, currently the visible change would be 100%->100%. Without OffsetToFirstDefinition, it would be something like 20%->30%, which makes the change visible.

Regarding tracking changes and their visibility, dropping the calculation of OffsetToFirstDefinition is not a panacea, especially if we are talking about 'coverage buckets'. In cases where we have a large enclosing scope and a relatively small variable scope (or coverage), we will see no changes because they will be smaller than the 10% needed to cross a bucket's threshold.
So, the only statistic we can rely on is the absolute number of bytes covered by the variable's DW_AT_locations, and maybe the number of bytes covered divided by the number of bytes in the enclosing scope. Doesn't this make even significant changes unnoticeable?

Depends on the definition of significance - if a change only introduces a few small changes in total variable coverage, is that significant?

But a different bucketing scheme could be used, maybe based on logarithmic buckets of scope or location coverage byte count. But I'm not sure that's too important - I wouldn't expect short ranges to have very different location characteristics from longer ranges (values in long ranges will still get moved into registers and other things, like in short ranges, etc.) - so I'd expect any significant change to show up more evenly across short and long ranges, and overall % growth to be fairly evenly distributed.

I think that as long as we can clearly document the statistics we measure, we can keep counting that. So, my point is, we can calculate the buckets against the variable's scope; likewise, we can add a field to the statistics reporting the overall stats against the adjusted scope (the approximate variable lifetime).

krisb added a comment.Fri, Nov 1, 4:09 AM

Currently (in master), we have the following statistics related to the discussion:

  1. The total number of bytes covered by DW_AT_location over all variables. We also have separate statistics for local vars only, params only and entry values for vars and params.
  2. The total number of bytes in variables' scopes over all variables. A variable's scope here means the scope adjusted to start at the lowest address in the variable's DW_AT_location. Plus, separate statistics for local vars only and params only.
  3. The number of variables in each coverage bucket, where the coverage bucket is calculated as 'the number of bytes covered by a variable's DW_AT_location' / 'the number of bytes in the variable's enclosing scope'. A variable's enclosing scope means the full scope (e.g. full subprogram, lexical block, etc.) without any adjustments. There are also separate statistics for local vars and params, and all three with entry values excluded.

I propose to move to the following:

  1. Keep as is.
  2. Keep but:
    • document it properly with all the assumptions and limitations, as 'it's known that the adjusted scope might not accurately represent the variable's scope, but we still find it useful',
    • do not calculate it for params,
    • apply the fixes proposed in this patch (i.e. account for non-contiguous ranges while calculating OffsetToFirstDefinition),
    • TODO: there might be room for further investigation that can improve the heuristic.
  3. Keep but:
    • switch to 'the number of bytes covered by a variable's DW_AT_location' / 'the number of bytes in variable's adjusted scope' for local vars,
    • document everything related to the 'adjusted' scope.
  4. Add new statistics: the total number of bytes in variable's enclosing scopes over all variables.

What do you think?
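The non-contiguous-ranges fix from item 2 can be sketched as follows (a hypothetical Python illustration with made-up helper names, not the Statistics.cpp implementation; the ranges are half-open [lo, hi) pairs taken from the examples in the summary):

```python
def adjusted_scope_bytes(scope_ranges, first_def_addr):
    """Scope bytes from the variable's first definition onward.

    scope_ranges is a list of half-open [lo, hi) address ranges, which
    may be disconnected; first_def_addr is the lowest address in the
    variable's DW_AT_location.
    """
    total = 0
    for lo, hi in scope_ranges:
        # Count only the part of each range at or above the first
        # definition; ranges entirely below it contribute nothing.
        total += max(0, hi - max(lo, first_def_addr))
    return total

# First example from the summary: scope [0x04,0x08) u [0x0a,0x0e),
# variable first defined at 0x0a.  Subtracting (0x0a - 0x04 = 6) from
# the 8-byte scope gives 2; counting per range gives the expected 4.
print(adjusted_scope_bytes([(0x04, 0x08), (0x0a, 0x0e)], 0x0a))  # 4

# Second example: scope [0x00,0x02) u [0x0a,0x0e); the naive offset
# (0x0a - 0x00 = 10) exceeds the 6-byte scope, while per-range
# counting still yields 4.
print(adjusted_scope_bytes([(0x00, 0x02), (0x0a, 0x0e)], 0x0a))  # 4
```

This avoids both failure modes described in the summary: the offset can never over-subtract across a gap, and it can never exceed the scope size.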

avl added a comment.Fri, Nov 1, 5:01 AM

if you want a % one, "bytes covered by location lists" / "bytes of enclosing scopes" (counting each scope once for each variable within it - or doing the loc-list-intersects-with-enclosing-scope earlier & just tracking the relative numbers from that point).

Probably, I misunderstood the above. Please correct me if I'm wrong:

"bytes covered by location lists" - includes all variables.
"bytes of enclosing scopes" - includes all scopes, and each scope is counted once for each variable within it.

Correct?

In that case, if we have one scope which has 10 variables with locations covering it from start to end, then the calculated percentage would be 1000%.

"bytes covered by location lists" / "bytes of enclosing scopes" * 100 = 1000%

To keep the resulting percentage within 100%, we need to count the scope once for each variable.

for each variable:
    for each scope_pc_range from start_scope_pc till end_scope_pc:
        ScopeBytes += scope_pc_range
        ScopeBytesCovered += scope_pc_range INTERSECTION OF variable_loc_list_range

ScopeCoverage = ScopeBytesCovered / ScopeBytes * 100

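The pseudocode above can be made runnable as follows (a hypothetical Python sketch, not actual llvm-dwarfdump code): the scope's bytes go into the denominator once per variable, so the resulting coverage can never exceed 100%:

```python
def overlap(a, b):
    """Number of bytes shared by two half-open [lo, hi) ranges."""
    return max(0, min(a[1], b[1]) - max(a[0], b[0]))

def scope_coverage(scope_ranges, variables):
    """Coverage % with scope bytes counted once per variable.

    variables is a list where each element is one variable's
    location-list ranges; all ranges are half-open [lo, hi) pairs.
    """
    scope_bytes = 0
    covered_bytes = 0
    for loc_ranges in variables:
        for s in scope_ranges:
            scope_bytes += s[1] - s[0]
            # Only the part of the location list inside the scope counts.
            covered_bytes += sum(overlap(s, loc) for loc in loc_ranges)
    return 100.0 * covered_bytes / scope_bytes

# One 10-byte scope, two variables: one fully covered, one half covered.
# Counting the scope once per variable yields 75%, never 1000%-style
# over-counting as in the ten-variable example above.
print(scope_coverage([(0x00, 0x0a)],
                     [[(0x00, 0x0a)], [(0x00, 0x05)]]))  # 75.0
```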
avl added a comment.Fri, Nov 1, 6:00 AM

What do you think?

I think we need to have (for Statistics.cpp):

ScopeBytesCovered -> includes all scope bytes intersected with the variable's loc list. For all variables and parameters.
ScopeBytesFromFirstDefinition -> renamed to ScopeBytes; includes all scope bytes, counted for each variable separately. For all variables and parameters.

ParamScopeBytesCovered -> same as ScopeBytesCovered but for parameters only.
ParamScopeBytesFromFirstDefinition -> same as ScopeBytes but for parameters only.

VarScopeBytesCovered -> same as ScopeBytesCovered but for variables only.
VarScopeBytesFromFirstDefinition -> same as ScopeBytes but for variables only.

Buckets are calculated against the non-trimmed (non-adjusted) scope (that is, the same value calculated as ScopeBytes for a single variable).

documentation is updated accordingly.

If we would like to calculate coverage against a trimmed/adjusted/approximate-variable-lifetime scope, then we could add additional statistics fields.

@krisb Thanks for summarizing the things!

I think as well that it would be good (and useful) to add new fields (buckets) for the local variables statistics with the adjusted scope, and keep the existing buckets calculated against the enclosing scope. And then, we could add a new option to llvm-locstats (http://llvm.org/docs//CommandGuide/llvm-locstats.html) to handle it properly.

Let us know what @aprantl and @vsk think about the ideas. :)

vsk added a comment.Mon, Nov 4, 1:22 PM
In D69027#1729978, @avl wrote:

if you want a % one, "bytes covered by location lists" / "bytes of enclosing scopes" (counting each scope once for each variable within it - or doing the loc-list-intersects-with-enclosing-scope earlier & just tracking the relative numbers from that point).

[snip ...]
To keep the resulting percentage within 100%, we need to count the scope once for each variable.

@avl I believe that's roughly what @dblaikie suggested, so we should be on the same page.

Currently (in master), we have the following statistics related to the discussion:

  1. The total number of bytes covered by DW_AT_location over all variables. We also have separate statistics for local vars only, params only and entry values for vars and params.
  2. The total number of bytes in variables' scopes over all variables. A variable's scope here means the scope adjusted to start at the lowest address in the variable's DW_AT_location. Plus, separate statistics for local vars only and params only.
  3. The number of variables in each coverage bucket, where the coverage bucket is calculated as 'the number of bytes covered by a variable's DW_AT_location' / 'the number of bytes in the variable's enclosing scope'. A variable's enclosing scope means the full scope (e.g. full subprogram, lexical block, etc.) without any adjustments. There are also separate statistics for local vars and params, and all three with entry values excluded.

    I propose to move to the following:
  1. Keep as is.
  2. Keep but:
    • document it properly with all the assumptions and limitations, as 'it's known that the adjusted scope might not accurately represent the variable's scope, but we still find it useful',
    • do not calculate it for params,
    • apply the fixes proposed in this patch (i.e. account for non-contiguous ranges while calculating OffsetToFirstDefinition),
    • TODO: there might be room for further investigation that can improve the heuristic.
  3. Keep but:
    • switch to 'the number of bytes covered by a variable's DW_AT_location' / 'the number of bytes in variable's adjusted scope' for local vars,
    • document everything related to the 'adjusted' scope.
  4. Add new statistics: the total number of bytes in variable's enclosing scopes over all variables.

    What do you think?

+1. I think the first set of statistics would give a decent overview of debug info quality, and the second (without any adjustments to enclosing scope sizes) would be more useful for evaluating individual changes. I worry that only having the second set of statistics would paint an unrealistically bleak picture.

+1. I think the first set of statistics would give a decent overview of debug info quality, and the second (without any adjustments to enclosing scope sizes) would be more useful for evaluating individual changes. I worry that only having the second set of statistics would paint an unrealistically bleak picture.

I'm not sure the bleakness is as much of a problem (but I didn't implement/don't personally use the statistics at all) - like all the other measures, this would be a relative one, not one where "100%" was a goal/meaningful (& the only reason to view it as a % would be to understand progress in a relative sense, I guess? Though measuring it on exactly the same code as an absolute value would be fine and represent progress too)

If we are going to keep adjusted scope values - I'd love to hear a proposal that explains how to do so in a way that makes sense to me in the face of arbitrary basic block order - so far as I can think of, the only "adjusted scope bytes" that would make sense to me would be one that looks at each basic block (which would require disassembly to even identify, unfortunately) and has an adjusted scope of "first byte with a valid location description to last byte with a valid location description" within a scope subrange within a basic block. Anything else seems not very meaningful to me due to basic block order being unspecified (& maybe based on profile-driven optimization, etc).
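The basic-block-relative adjustment described above could be sketched like this, assuming basic-block boundaries were somehow available from disassembly (a hypothetical illustration with invented helper names, not a real implementation):

```python
def overlap_range(a, b):
    """Intersection of two half-open [lo, hi) ranges, or None."""
    lo, hi = max(a[0], b[0]), min(a[1], b[1])
    return (lo, hi) if lo < hi else None

def bb_adjusted_scope_bytes(scope_ranges, bb_ranges, loc_ranges):
    """Adjusted scope, computed per basic block: first byte with a
    valid location description to last byte with a valid location
    description, within each scope subrange inside a basic block."""
    total = 0
    for bb in bb_ranges:
        for s in scope_ranges:
            sub = overlap_range(s, bb)  # scope subrange within this BB
            if sub is None:
                continue
            covered = [r for r in (overlap_range(sub, l) for l in loc_ranges) if r]
            if covered:
                total += max(hi for _, hi in covered) - min(lo for lo, _ in covered)
    return total

# One 20-byte scope split across two basic blocks; locations cover
# [2,6) in the first block and [12,14) in the second, so the adjusted
# scope is 4 + 2 = 6 bytes regardless of how the blocks are ordered.
print(bb_adjusted_scope_bytes([(0, 20)], [(0, 10), (10, 20)],
                              [(2, 6), (12, 14)]))  # 6
```

Because each block is adjusted independently, reordering the blocks does not change the result, which is the property a whole-scope LowPC-based adjustment lacks.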

avl added a comment.Wed, Nov 6, 5:46 AM

I'm not sure the bleakness is as much of a problem (but I didn't implement/don't personally use the statistics at all) - like all the other measures, this would be a relative one, not one where "100%" was a goal/meaningful (& the only reason to view it as a % would be to understand progress in a relative sense, I guess? Though measuring it on exactly the same code as an absolute value would be fine and represent progress too)

It seems that the %-coefficient was not originally created for understanding progress.
The idea is to see completeness, i.e., if the coverage for all variables matches the coverage in the source code, then we have complete debug info. This coefficient _is_not_precise_. But it shows the direction.

If we have a variable declared at the very beginning of the scope, then we assume that the coverage for that variable should be 100%. If we see a %-coefficient of 30%, then it signals that the debug info is _probably_ incomplete. The same information could be derived from absolute values, but to see it, we would need to do the same calculations for the absolute values as are done for the coefficient.

Another thing is that absolute values change too often. Suppose we have a variable declared at the start of a scope; that means its coverage (in source code) is 100%. Its calculated coverage would be 200 bytes, and the scope size is also 200 bytes. After some optimization, the scope size increases to 250 bytes, and the calculated coverage for the variable becomes 250 bytes (the debug info covers the whole scope). Thus we would see a 200->250 byte increase, but in both cases the debug info is equally complete. If we saw 100%->100% in that case, we would see "no changes, everything is OK." If we see 200->250, then we would need to additionally check that case (i.e. it is a false alarm).

From that point of view, having the %-coefficient as a completeness signal looks useful.
In that case, it is essential to have the coefficient in the range of 0%-100%.
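The argument can be made concrete with the numbers from the example above (a toy calculation, not real llvm-dwarfdump output):

```python
def coverage_pct(covered_bytes, scope_bytes):
    """Coverage percentage of a single variable's scope."""
    return 100.0 * covered_bytes / scope_bytes

# Before optimization: 200-byte scope, location list covers all of it.
# After: the scope grows to 250 bytes and the locations grow with it.
before_abs, after_abs = 200, 250
print(before_abs, "->", after_abs)  # absolute bytes look like a change
print(coverage_pct(200, 200), "->", coverage_pct(250, 250))  # 100.0 -> 100.0
```

The percentage stays at 100% because the debug info is equally complete in both builds, while the raw byte counts alone would trigger an unnecessary investigation.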

If we are going to keep adjusted scope values - I'd love to hear a proposal that explains how to do so in a way that makes sense to me in the face of arbitrary basic block order - so far as I can think of, the only "adjusted scope bytes" that would make sense to me would be one that looks at each basic block (which would require disassembly to even identify, unfortunately) and has an adjusted scope of "first byte with a valid location description to last byte with a valid location description" within a scope subrange within a basic block. Anything else seems not very meaningful to me due to basic block order being unspecified (& maybe based on profile-driven optimization, etc).

I also think that calculating the adjusted scope just by trimming to the first reported location is not the right way, because the real variable scope is unknown and because a variable's lifetime in optimized code may not match its scope. Calculating coverage over a variable's lifetime would be the most useful metric in that case (though it requires knowledge of the real instructions).

krisb added a comment.Sat, Nov 9, 11:56 AM

When I started to work on the statistics, I also thought that the percentage buckets were to show the completeness of the debug info. Unfortunately, due to BB reordering this information is sometimes not just imprecise, but terribly wrong. And in the general case, we can't even guarantee any precision. I agree with @dblaikie; we need a way to calculate the adjusted scope correctly regardless of basic block reordering to justify the notion of the adjusted scope itself. Otherwise, I'd prefer to reduce maintenance costs and calculate debug info coverage against the full enclosing scope. In such a case, only changes in percentages would make sense, but at least this information won't be misleading. So, I tend to abandon this patch and proceed with another one that changes all the calculations against an adjusted scope to calculations against a full enclosing scope (if there are no strong objections).

There is another thought about the adjusted scope, though. It's clearly out of the scope of this patch, but it's probably worth discussing here. I just found that the DWARF specification (both v4 and v5) provides the DW_AT_start_scope attribute, which represents exactly the 'adjusted' scope. So, as a possible solution, we could move all the calculation of the adjusted scope to the compiler (but I have no idea if the compiler can emit something precise enough here, or whether it's worth doing for statistics purposes). The usage of the attribute was discussed in the context of the dwarflint --stats tool, but that was more than nine years ago: https://lists.fedorahosted.org/pipermail/elfutils-devel/2010-July/001498.html. As I can see, neither gcc nor clang produces this attribute at the moment.

When I started to work on the statistics, I also thought that the percentage buckets were to show the completeness of the debug info. Unfortunately, due to BB reordering this information is sometimes not just imprecise, but terribly wrong. And in the general case, we can't even guarantee any precision. I agree with @dblaikie; we need a way to calculate the adjusted scope correctly regardless of basic block reordering to justify the notion of the adjusted scope itself. Otherwise, I'd prefer to reduce maintenance costs and calculate debug info coverage against the full enclosing scope. In such a case, only changes in percentages would make sense, but at least this information won't be misleading. So, I tend to abandon this patch and proceed with another one that changes all the calculations against an adjusted scope to calculations against a full enclosing scope (if there are no strong objections).

There is another thought about the adjusted scope, though. It's clearly out of the scope of this patch, but it's probably worth discussing here. I just found that the DWARF specification (both v4 and v5) provides the DW_AT_start_scope attribute, which represents exactly the 'adjusted' scope. So, as a possible solution, we could move all the calculation of the adjusted scope to the compiler (but I have no idea if the compiler can emit something precise enough here, or whether it's worth doing for statistics purposes). The usage of the attribute was discussed in the context of the dwarflint --stats tool, but that was more than nine years ago: https://lists.fedorahosted.org/pipermail/elfutils-devel/2010-July/001498.html. As I can see, neither gcc nor clang produces this attribute at the moment.

Yeah, that could be handy - but it's probably pretty complicated to propagate through the compiler & would only have limited benefits to users (in normal debugger scenarios, rather than stats) - the difference between "this variable is out of scope" (or using another variable that would otherwise have been shadowed - that's pretty important) and "this variable's value is optimized out". The complexity of the implementation might be why it hasn't been implemented - it's possible that an implementation would be justifiable in the context of bigger improvements like is_stmt accuracy or sub-expression source range grouping, or other ideas that've been thrown around from time to time.