This is an archive of the discontinued LLVM Phabricator instance.

[utils] Add the llvm-locstats tool
ClosedPublic

Authored by djtodoro on Aug 21 2019, 5:26 AM.

Details

Summary

As we discussed on the D65585, this patch introduces the llvm-locstats tool.

The line 0 shows the number (and the percentage) of DIEs with no location information, but the line 100 shows the number (and the percentage) of DIEs where there is location information in all code section bytes (where the variable or parameter is in the scope). The line 51..59 shows the number (and the percentage) of DIEs where the location information is in between 51 and 59 percentage of its scope covered. There are options for considering only local variables or formal parameters, and also there is an option for reporting the stats by ignoring the debug entry values generated.

Using the tool

llvm-locstats test.o

=================================================
          Debug Location Statistics
=================================================
      cov%          samples       percentage(~)
-------------------------------------------------
        0                 1              16%
     1..9                 0               0%
   10..19                 0               0%
   20..29                 0               0%
   30..39                 0               0%
   40..49                 0               0%
   50..99                 1              16%
   60..69                 0               0%
   70..79                 0               0%
   80..89                 1              16%
   90..99                 0               0%
      100                 3              50%
=================================================
-the number of debug variables processed: 6
-PC ranges covered: 81%
-------------------------------------------------
-total availability: 83%
=================================================

Or,

llvm-locstats -ignore-debug-entry-values test.o

=================================================
           Debug Location Statistics
=================================================
      cov%          samples       percentage(~)
-------------------------------------------------
        0                 1              16%
     1..9                 0               0%
   10..19                 0               0%
   20..29                 0               0%
   30..39                 0               0%
   40..49                 0               0%
   50..99                 2              33%
   60..69                 0               0%
   70..79                 0               0%
   80..89                 1              16%
   90..99                 0               0%
      100                 2              33%
=================================================
-the number of debug variables processed: 6
-PC ranges covered: 76%
-------------------------------------------------
-total availability: 83%
=================================================

Diff Detail

Repository
rL LLVM

Event Timeline

djtodoro created this revision.Aug 21 2019, 5:26 AM
Herald added a project: Restricted Project. · View Herald TranscriptAug 21 2019, 5:26 AM
vsk added a comment.Aug 21 2019, 9:36 AM

I really like this approach. Thanks for doing this!

llvm/utils/llvm-locstats/llvm-locstats.py
107 ↗(On Diff #216377)

Please use subprocess.call or Popen along with PIPE to avoid round-tripping to the filesystem, leaving stale files around by accident, etc.

109 ↗(On Diff #216377)

Please use os.path.join to support Windows.

Orlando added inline comments.
llvm/utils/llvm-locstats/llvm-locstats.py
107 ↗(On Diff #216377)

Would it be worth using tempfile.NamedTemporaryFile instead of hard coding the temporary file name here?

djtodoro marked 2 inline comments as done.Aug 22 2019, 7:39 AM

@vsk

I really like this approach. Thanks for doing this!

Thanks!

llvm/utils/llvm-locstats/llvm-locstats.py
107 ↗(On Diff #216377)

Please use subprocess.call or Popen along with PIPE to avoid round-tripping to the filesystem, leaving stale files around by accident, etc.

Sure.

Would it be worth using tempfile.NamedTemporaryFile instead of hard coding the temporary file name here?

I think so, thanks!

109 ↗(On Diff #216377)

Sure.

djtodoro updated this revision to Diff 217374.Aug 27 2019, 5:14 AM
djtodoro edited the summary of this revision. (Show Details)

-Rebase
-Use the tempfile.TemporaryFile
-Use subprocess.calll
-Use os.path.join

vsk added inline comments.Aug 28 2019, 1:41 PM
llvm/utils/llvm-locstats/llvm-locstats.py
167 ↗(On Diff #217374)

Still not sure why this needs to touch the filesystem. Why not use a pipe? Like:

subproc = Popen(cmd, stdin=PIPE, stdout=PIPE, stderr=PIPE, universal_newlines = True)
cmd_stdout, cmd_stderr = subproc.communicate()

213 ↗(On Diff #217374)

I think it'd be great to simplify this. Note that "1-9%" is covered, as is "11-19%", but that "10%" is missed. It should be possible to write a helper to get the buckets, like:

def coverage_buckets():
  yield '0%'
  for start in range(1, 92, 10):
    yield '{0}%-{1}%'.format(start, start + 9)
  yield '100%'

You can use a helper like that to use a for-loop to extract all of the "vars with $bucket of its scope covered" fields, etc.

254 ↗(On Diff #217374)

Ditto here and everywhere else we need to look up coverage buckets within json_parsed.

djtodoro updated this revision to Diff 218063.Aug 30 2019, 3:25 AM
  • Simplify the code
  • Rebase

@vsk Thanks for your comments!

vsk added inline comments.Aug 30 2019, 9:31 AM
llvm/utils/llvm-locstats/llvm-locstats.py
7 ↗(On Diff #218063)

I think this either needs '#!/usr/bin/env python3' or 'from future import print_function'.

175 ↗(On Diff #218063)

What's the difference between this ('total variables procesed by location statistics') and 'total vars procesed by location statistics' above?

djtodoro marked 2 inline comments as done.Sep 2 2019, 3:32 AM
djtodoro added inline comments.
llvm/utils/llvm-locstats/llvm-locstats.py
7 ↗(On Diff #218063)

I see.. Thanks!

The other LLVM python scripts uses the print function from future package, so I will use that as well.

175 ↗(On Diff #218063)

I just followed the existing name convention within llvm-dwarfdump statistics. Reporting stats with "variables" applies to both local variables and formal parameters, "vars" applies only to local variables and "formal parameters" obviously applies only to formal parameters.

djtodoro updated this revision to Diff 218325.Sep 2 2019, 3:41 AM
-Use `print_function ` from `__future__`
vsk accepted this revision.Sep 3 2019, 1:11 PM

Thanks, this looks good to me. Perhaps you can get more feedback by dropping the WIP label. @aprantl, any thoughts?

llvm/utils/llvm-locstats/llvm-locstats.py
175 ↗(On Diff #218063)

Ok. Perhaps we can revisit that & pick a clearer naming convention later.

This revision is now accepted and ready to land.Sep 3 2019, 1:11 PM
djtodoro marked an inline comment as done.Sep 5 2019, 7:49 AM
djtodoro added inline comments.
llvm/utils/llvm-locstats/llvm-locstats.py
175 ↗(On Diff #218063)

I agree with that.
Thanks a lot for the review and comments! I really appreciate it!

djtodoro updated this revision to Diff 218921.Sep 5 2019, 7:49 AM
djtodoro retitled this revision from WIP: [utils] Add the llvm-locstats tool to [utils] Add the llvm-locstats tool.
djtodoro edited the summary of this revision. (Show Details)

-Add the documentation
-Add the test case
-Adjust the CMake

Is this OK to go?

@aprantl Do you have any additional comments ? :)

aprantl accepted this revision.Sep 9 2019, 2:07 PM
This revision was automatically updated to reflect the committed changes.
djtodoro reopened this revision.Sep 12 2019, 8:55 AM

This was reverted with the rL371527 due to test failure on Windows platform.

This revision is now accepted and ready to land.Sep 12 2019, 8:55 AM
djtodoro updated this revision to Diff 219933.Sep 12 2019, 9:15 AM

-Adjust the CMake
-Adjust the tool to work on Windows (skip the backslash character from JSON string)
-Adjust the test case to work on Windows

djtodoro requested review of this revision.Sep 12 2019, 9:16 AM
vsk added inline comments.Sep 12 2019, 5:40 PM
utils/llvm-locstats/llvm-locstats.py
120 ↗(On Diff #219933)

I'm a bit confused by this. Mind sharing a link / description for the error?

Did llvm-dwarfdump produce invalid (unescaped) JSON, thus causing python's JSON parser to fail?

djtodoro marked an inline comment as done.Sep 12 2019, 11:55 PM
djtodoro added inline comments.
utils/llvm-locstats/llvm-locstats.py
120 ↗(On Diff #219933)

Sure.

When try to load the JSON from llvm-dwarfdump output with loads() from json package it trows an exception on Windows platform.

Invalid \escape: line 1 column 23 (char 23)

The reason of the exception is special character '\' from Windows path. I think that llvm-dwarfdump should handle that, so this won't be needed here.

djtodoro marked an inline comment as done.Sep 17 2019, 2:08 AM
djtodoro added inline comments.
utils/llvm-locstats/llvm-locstats.py
120 ↗(On Diff #219933)

@vsk Do you think we should handle this on the llvm-dwarfdump side, since we have the JSON that breaks the specification (the JSON spec)?

vsk added inline comments.Sep 17 2019, 11:24 AM
utils/llvm-locstats/llvm-locstats.py
120 ↗(On Diff #219933)

Absolutely, llvm-dwarfdump needs to produce valid json, as I'm sure there are clients that rely on that. I understand that may be more work than you signed up for :). If you don't have the cycles for that, let me know.

djtodoro marked an inline comment as done.Sep 17 2019, 11:28 PM
djtodoro added inline comments.
utils/llvm-locstats/llvm-locstats.py
120 ↗(On Diff #219933)

This will be a small fix. Thanks for your comments and reviews! :)

@vsk I believe this can go again, according to the comment from D67699?

vsk added a comment.Sep 20 2019, 12:09 PM

Yes, please go for it :)

This revision was not accepted when it landed; it landed in state Needs Review.Sep 23 2019, 12:56 AM
This revision was automatically updated to reflect the committed changes.

This patch seems to have broken the build

llvm-lit: /redacted/git/llvm-project/llvm/utils/lit/lit/llvm/config.py:340: note: using clang: /tmp/llvm-project_dbg_compiled-with-clang/bin/clang
FAIL: LLVM :: tools/llvm-dwarfdump/X86/locstats.ll (47789 of 49472)
******************** TEST 'LLVM :: tools/llvm-dwarfdump/X86/locstats.ll' FAILED ********************
Script:
--
: 'RUN: at line 1';   /tmp/llvm-project_dbg_compiled-with-clang/bin/llc -debug-entry-values /redacted/git/llvm-project/llvm/test/tools/llvm-dwarfdump/X86/locstats.ll -o - -filetype=obj    | /tmp/llvm-project_dbg_compiled-with-clang/bin/llvm-dwarfdump -statistics - | /tmp/llvm-project_dbg_compiled-with-clang/bin/FileCheck /redacted/git/llvm-project/llvm/test/tools/llvm-dwarfdump/X86/locstats.ll
: 'RUN: at line 4';   /tmp/llvm-project_dbg_compiled-with-clang/bin/llc -debug-entry-values /redacted/git/llvm-project/llvm/test/tools/llvm-dwarfdump/X86/locstats.ll -o /tmp/llvm-project_dbg_compiled-with-clang/test/tools/llvm-dwarfdump/X86/Output/locstats.ll.tmp0.o -filetype=obj    | '/usr/bin/python' /tmp/llvm-project_dbg_compiled-with-clang/./bin/llvm-locstats /tmp/llvm-project_dbg_compiled-with-clang/test/tools/llvm-dwarfdump/X86/Output/locstats.ll.tmp0.o | /tmp/llvm-project_dbg_compiled-with-clang/bin/FileCheck /redacted/git/llvm-project/llvm/test/tools/llvm-dwarfdump/X86/locstats.ll --check-prefix=LOCSTATS
--
Exit Code: 2

Command Output (stderr):
--
/usr/bin/python: can't open file '/tmp/llvm-project_dbg_compiled-with-clang/./bin/llvm-locstats': [Errno 2] No such file or directory
FileCheck error: '-' is empty.
FileCheck command line:  /tmp/llvm-project_dbg_compiled-with-clang/bin/FileCheck /redacted/git/llvm-project/llvm/test/tools/llvm-dwarfdump/X86/locstats.ll --check-prefix=LOCSTATS

--

@gchatelet What build is this about? Since I can not see the failure on the http://lab.llvm.org:8011/console.

@gchatelet What build is this about? Since I can not see the failure on the http://lab.llvm.org:8011/console.

@djtodoro my apologies I think it's a local cmake caching problem. Sorry for the noise.

@gchatelet No problem :) I am happy if that is the case!

I found a build failure on the sanitizer-x86_64-linux-fast. I will investigate that..

I found a build failure on the sanitizer-x86_64-linux-fast. I will investigate that..

Yes even with a clean build it's still failing, here is my reproducer:

cmake -DCMAKE_BUILD_TYPE=Debug -DCMAKE_C_COMPILER=/usr/local/bin/clang -DCMAKE_ASM_COMPILER=/usr/local/bin/clang -DCMAKE_CXX_COMPILER=/usr/local/bin/clang++ -DCMAKE_EXPORT_COMPILE_COMMANDS=ON -DBUILD_SHARED_LIBS=ON -DLLVM_USE_SPLIT_DWARF=ON -DLLVM_OPTIMIZED_TABLEGEN=ON '-DLLVM_EXPERIMENTAL_TARGETS_TO_BUILD=ARC;AVR' -DLLVM_ENABLE_PROJECTS=clang -DLLVM_BUILD_TESTS=ON -DLLVM_ENABLE_WERROR=ON -H/redacted/llvm-project/llvm -B/tmp/llvm-project_dbg_compiled-with-clang -GNinja

ninja -C /tmp/llvm-project_dbg_compiled-with-clang check-all
djtodoro reopened this revision.Sep 23 2019, 4:11 AM

This is reverted with the rL372580 while investigating.

The problem is within CMake changes. In some configurations the tool is not copied into the right directory, so the test fails due to the reason.

djtodoro updated this revision to Diff 222101.Sep 27 2019, 2:01 AM

-Make the testcase unsupported when needed

@vsk WDYT?

vsk added a comment.Sep 27 2019, 10:26 AM

This is checking for the existence of llvm-locstats before running the test? Seems reasonable to me, please go ahead and feel free to fix forwards. Thanks!

djtodoro accepted this revision.Sep 29 2019, 11:26 PM

This is checking for the existence of llvm-locstats before running the test?

Yes.

Thanks again!!

This revision is now accepted and ready to land.Sep 29 2019, 11:26 PM
This revision was automatically updated to reflect the committed changes.

I see this is reverted again; a couple comments while trying to import this:

llvm/trunk/test/tools/llvm-locstats/locstats.ll
1–2

The pipe should be deleted; llc is generating %t0.o as an intermediate object file and llvm-locstats is parsing that, not stdin. The stdout of llc is empty.

(If you want to support stdin, that should be a different test case)

llvm/trunk/utils/llvm-locstats/llvm-locstats.py
107–109

This assumes that llvm-dwarfdump exists in the same directory as llvm-locstats, which may not be true, e.g. for build systems that build dwardump and locstats in separate build trees. Can this be configurable? Possibly a regular $PATH lookup makes sense here (with cwd as a preferred location), although allowing a flag pointing to the location of dwarfdump would be the most flexible.

rupprecht added inline comments.Oct 1 2019, 11:22 AM
llvm/trunk/utils/llvm-locstats/llvm-locstats.py
107–109

(Note: feel free to reland this bit as-is, I can suggest this in the form of a patch once this landing has stuck).

djtodoro marked 2 inline comments as done.Oct 1 2019, 11:00 PM
djtodoro added inline comments.
llvm/trunk/test/tools/llvm-locstats/locstats.ll
1–2

Thanks! I have missed this..

llvm/trunk/utils/llvm-locstats/llvm-locstats.py
107–109

Sure... A custom build system could make situation like that, I also thought about some improvements, but that can go as separate patches.

(Note: feel free to reland this bit as-is, I can suggest this in the form of a patch once this landing has stuck).

Let's do that way. Please feel free to post such patch! :)