# [llvm-dwarfdump][Stats] Clean upClosedPublicActions

Authored by djtodoro on Apr 9 2020, 4:01 AM.

# Details

Reviewers
 aprantl vsk jhenderson
Commits
rG0a4defe8c877: [llvm-dwarfdump][Stats] Clean up
Summary

-Clean up the source code
-Refactor the JSON fields
-Fix the test cases

# Diff Detail

### Event Timeline

djtodoro created this revision.Apr 9 2020, 4:01 AM
llvm/test/tools/llvm-dwarfdump/X86/locstats.ll
5

comparing this to variables with 0% of its scope covered, should we go even further in terms of coming up with a more descriptive field name?

djtodoro marked an inline comment as done.Apr 10 2020, 5:19 AM
llvm/test/tools/llvm-dwarfdump/X86/locstats.ll
5

I strongly agree.

How about total bytes in each local var's scope? And to changing all the similar fields respectively, in that way.

llvm/test/tools/llvm-dwarfdump/X86/locstats.ll
5

Let's try to be systematic, and not too verbose, but still precise. Here's a suggestion using a TeX-style syntax that I happen to feel comfortable with, but I'm sure others' opinions will vary:

"local vars scope bytes covered" ->
"\sum_{all local vars} #bytes in parent scope covered by DW_AT_location"
"\sum_{all local vars} #bytes in parent scope"
"\sum_{all params} #bytes in parent scope"
...

"total variables procesed by location statistics" ->
"#local vars + #params"

and not "\sum_{all local vars} 1" :-)

"#local vars with 0% of parent scope covered by DW_AT_location"
"#params \ entry values with [60%,70%) of parent scope covered by DW_AT_location"

Let me know what you think!

djtodoro marked an inline comment as done.Apr 21 2020, 5:45 AM

I've been away for a few days, sorry the late response.

llvm/test/tools/llvm-dwarfdump/X86/locstats.ll
5

I am also fan of the TeX syntax, and strong +1 for this.
OTOH, I think we should document this somehow, at least mentioning that the JSON fields are written in the spirit of the TeX. Please let me know if you agree with that.

djtodoro updated this revision to Diff 258972.Apr 21 2020, 5:47 AM

-refactoring the JSON fields in the spirit of the TeX syntax

llvm/test/tools/llvm-dwarfdump/X86/locstats.ll
5

To me, the logical thing to do would be to add a section to the llvm-dwarfdump CommandGuide documentation that describes the format.

@jhenderson Thanks for the comment.

djtodoro updated this revision to Diff 259516.Apr 23 2020, 4:16 AM

djtodoro updated this revision to Diff 259551.Apr 23 2020, 6:46 AM

-fixing a typo

Now that I see it in front of me I'm not sure it was a great idea to use a backslash in quoting. But I also don't have a much better idea. TeX math notation is at least widely recognized and used in all kinds of software.

Could we make the string literals in the code raw string literals though?

i.e., r'\sum' in Python, and R"\sum" in C++.

llvm/utils/llvm-locstats/llvm-locstats.py
268

I think this should be \/, a forward-slash, right?

llvm/docs/CommandGuide/llvm-dwarfdump.rst
117

Nit: #bytes is a shorthand I made up, in TeX it would actually be \#bytes. Is |bytes| a better notation?

djtodoro marked an inline comment as done.Apr 23 2020, 2:30 PM

Now that I see it in front of me I'm not sure it was a great idea to use a backslash in quoting. But I also don't have a much better idea. TeX math notation is at least widely recognized and used in all kinds of software.

The alternative is just to introduce our own syntax and document it well.
i. e.

\sum_{all_variables} ==> SUM_ALL(variables)
\# ==> #bytes
etc.

WDYT?

Could we make the string literals in the code raw string literals though?

i.e., r'\sum' in Python, and R"\sum" in C++.

I think we can do that.

llvm/docs/CommandGuide/llvm-dwarfdump.rst
117

Nit: #bytes is a shorthand I made up, in TeX it would actually be \#bytes.

Oh, yes. I see, but it has just looked better to me :)

Is |bytes| a better notation?

I think that looks even more unclear to me.

llvm/test/tools/llvm-dwarfdump/X86/locstats.ll
25

Add -NEXT: where applicable. This improves the FileCheck error messages greatly in case the diagnostic changes.

Herald added a subscriber: cmtice. Apr 23 2020, 2:55 PM

\sum_{all_variables} ==> SUM_ALL(variables)
\# ==> #bytes

If we're using a more function-like syntax I less expect the summand to come after the closing )

Does this

sum(all local variables: #bytes in parent scope ...)

or

sum_all_local_vars(#bytes in parent scope ...)

SUM_all_local_vars(#bytes in parent scope ...)

work better?

When I said add documentation, I was actually thinking of a more in-depth write up of the format, what information is presented etc, probably in a separate sub-heading. I don't use the stastistics stuff mind you, so I don't know if it would be really beneficial.

djtodoro marked an inline comment as done.Apr 24 2020, 8:38 AM

\sum_{all_variables} ==> SUM_ALL(variables)
\# ==> #bytes

If we're using a more function-like syntax I less expect the summand to come after the closing )

Does this

sum(all local variables: #bytes in parent scope ...)

or

sum_all_local_vars(#bytes in parent scope ...)

This is my favorite, but we'll see how that looks visually (when finished). :)

SUM_all_local_vars(#bytes in parent scope ...)

work better?

When I said add documentation, I was actually thinking of a more in-depth write up of the format, what information is presented etc, probably in a separate sub-heading. I don't use the stastistics stuff mind you, so I don't know if it would be really beneficial.

OK. Basically, we'll need an in-depth write up, since we will introduce the function-like syntax.

llvm/test/tools/llvm-dwarfdump/X86/locstats.ll
25

Sure. I didn't consider it, since it was present, but I'll refactor that as well.

djtodoro updated this revision to Diff 260615.Tue, Apr 28, 6:33 AM

-use a function-like syntax

TODO: use the string literals to avoid \\.
(@aprantl I think we can use the string literals only in C++ part, since I think r'..'feature in Python was introduced in Python 3.x, but we still use Python 2.7 for the llvm-locstats)

Herald added a project: Restricted Project. Tue, Apr 28, 6:34 AM
llvm/docs/CommandGuide/llvm-dwarfdump.rst
169

The :option:--statistics option generates single-line JSON output
representing quality metrics of the processed debug info. These metrics are useful
to compare changes between two compilers, particularly for judging the effect
that a change to the compiler has on the debug info quality.

The output is formatted as key-value pairs. The first pair contains a version number.
The following naming scheme is used for the keys:

...

175

why all only in the second case?

178

For aggregated values, the following keys are used:

182

llvm/docs/CommandGuide/llvm-dwarfdump.rst
116

You might want to add a link to the section below, so that users browsing on the web can click and jump to it.

djtodoro marked 4 inline comments as done.Wed, Apr 29, 6:22 AM
llvm/docs/CommandGuide/llvm-dwarfdump.rst
116

169

Sorry I haven't had much time to take another look into the write-up, but this looks very good! Thanks a lot for the suggestion!

175

Fixed.

182

Much better!

djtodoro updated this revision to Diff 260902.Wed, Apr 29, 6:22 AM

-refactoring the docs

aprantl accepted this revision.Wed, Apr 29, 11:09 AM
This revision is now accepted and ready to land.Wed, Apr 29, 11:09 AM
jhenderson accepted this revision.Thu, Apr 30, 1:43 AM

Thanks, doc changes look good to me. I assume you have built the documentation?

Thanks for the review!

Thanks, doc changes look good to me. I assume you have built the documentation?

Yes, I have. But I'll double check that before committing. Thanks!

This revision was automatically updated to reflect the committed changes.