This is an archive of the discontinued LLVM Phabricator instance.

Add total function byte size and inline function byte size to "llvm-dwarfdump --statistics"
Closed, Public

Authored by clayborg on Nov 7 2018, 11:29 AM.

Details

Summary

A useful metric to know is how much code you have in debug info (in bytes) versus the number of bytes attributable to inlined functions. This patch reports the total number of bytes for concrete functions (the size of all DW_TAG_subprogram DIE address ranges) and the total number of bytes for top-level inlined functions. This can help users see how much code has been inlined and help them tune for code size.

The formatted JSON output looks like:

{
  "version":1,
  "file":"/tmp/a.out.dSYM/Contents/Resources/DWARF/a.out",
  "format":"Mach-O 64-bit x86-64",
  "source functions":4,
  "inlined functions":6,
  "unique source variables":8,
  "source variables":10,
  "variables with location":10,
  "call site entries":0,
  "scope bytes total":789,
  "scope bytes covered":789,
  "total function size":147,
  "total inlined function size":39
}

The last two keys "total function size" and "total inlined function size" are new.
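
As an aside (not part of the patch), the two new keys make it straightforward to post-process the JSON and compute an inlined-code ratio. A minimal Python sketch, using the field names and values from the sample output above:

```python
import json

# Statistics as emitted by "llvm-dwarfdump --statistics" (subset of the
# sample output shown above).
stats = json.loads("""
{
  "version": 1,
  "total function size": 147,
  "total inlined function size": 39
}
""")

func_bytes = stats["total function size"]
inline_bytes = stats["total inlined function size"]

# Fraction of concrete-function bytes that came from inlined functions.
ratio = inline_bytes / func_bytes
print(f"inlined: {inline_bytes} of {func_bytes} bytes ({ratio:.1%})")
```

This is exactly the kind of quick "how much of my code is inlined?" check the summary describes.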

Diff Detail

Event Timeline

clayborg created this revision. Nov 7 2018, 11:29 AM
clayborg edited the summary of this revision. Nov 7 2018, 11:29 AM

Thanks. There seems to be some overlap with

[llvm-dev] RFC: Adding a code size analysis tool
https://lists.llvm.org/pipermail/llvm-dev/2018-September/126501.html

How was that discussion resolved?

aprantl added a reviewer: vsk. Nov 7 2018, 11:37 AM

I think the response was generally positive but I haven't had the time to push things into tree. @clayborg, have you tried out the tool here: https://github.com/vedantk/llvm-project ? The RFC has some usage instructions.

If you just want the ratio of inlined to not-inlined bytes, this patch is a simpler way to do it. But if you need something more fine-grained (i.e. *which* functions were inlined more compared to a baseline, by how much, etc.), it might be worth checking out. Happy to have someone else run with the code.

In D54217#1290504, @vsk wrote:

I think the response was generally positive but I haven't had the time to push things into tree. @clayborg, have you tried out the tool here: https://github.com/vedantk/llvm-project ? The RFC has some usage instructions.

If you just want the ratio of inlined to not-inlined bytes, this patch is a simpler way to do it. But if you need something more fine-grained (i.e. *which* functions were inlined more compared to a baseline, by how much, etc.), it might be worth checking out. Happy to have someone else run with the code.

This current patch is indeed just for an overview of inlined vs. not-inlined bytes. I have other patches coming that add more output to --statistics; breaking down inlined functions by count and size will be part of that future output. This one is just a quick breakdown. Like you said in the RFC linked above, it is nice to know that, say, 30% of your code is inlined. If you are expecting less inlining and try a new option, this is a quick way to see results in a very concise form.

vsk added a comment. Nov 7 2018, 1:29 PM

Sure, I have no objections. As you're thinking about future patches, though, I'd just suggest that you take a look at what can be salvaged/shared from my size tool prototype. There's some neat stuff there, like displaying full inlining trees. There's also an option to treat a batch of dSYMs as one unit, so you can compare all of the dsyms from one build of an OS to another, and see which symbols grew the most, were inlined the most, etc. If that's out of scope for what you need, that's fine too.

In D54217#1290632, @vsk wrote:

Sure, I have no objections. As you're thinking about future patches, though, I'd just suggest that you take a look at what can be salvaged/shared from my size tool prototype. There's some neat stuff there, like displaying full inlining trees. There's also an option to treat a batch of dSYMs as one unit, so you can compare all of the dsyms from one build of an OS to another, and see which symbols grew the most, were inlined the most, etc. If that's out of scope for what you need, that's fine too.

I will definitely take a look when I get to that point!

So it sounds like we are OK with the idea of this patch. Any comments on the patch itself?

vsk added a comment. Nov 7 2018, 5:13 PM

Looks reasonable to me, but it'd help to check in a .o for a test.

In D54217#1291170, @vsk wrote:

Looks reasonable to me, but it'd help to check in a .o for a test.

There don't seem to be any tests for any of this. Are there existing tests for stuff that is only in llvm-dwarfdump? If so, please point me to where they are.

There don't seem to be any tests for any of this. Are there existing tests for stuff that is only in llvm-dwarfdump? If so, please point me to where they are.

Where/how did you look for tests? I know they're not perfectly organized, but my usual approach is to search for some existing nearby text (in this case I searched GitHub for 'scope bytes covered', the statistic just before the ones you added, and found this test case: https://github.com/llvm-mirror/llvm/blob/9d809925c655ce689c4a99025c180f15ac8a7a53/test/tools/llvm-dwarfdump/X86/statistics.ll), or to look at the revision history of the feature I'm adding to and see what the last functional change was there and if/how it was tested.

yeah, an easy code search indeed kicked up the tests. I will add one. Sorry about that

Looks like I can piggy back on llvm/test/tools/llvm-dwarfdump/X86/statistics.ll

Does anyone know how to run only this test from command line?

In the bin directory of your build, there should be llvm-lit.py; run that, passing the relative filespec of the test file (full filespec probably works too but I never use it that way). This will give you a pass/fail on the one test file. Add -v to get detailed output if it fails.

yeah, an easy code search indeed kicked up the tests. I will add one. Sorry about that

No worries - still curious where you were looking for the tests initially, perhaps we can do something to make them more discoverable?

(another way I find tests: Break something nearby/related (eg: remove one of the existing statistics) & see what tests fail)

Looks like I can piggy back on llvm/test/tools/llvm-dwarfdump/X86/statistics.ll

Does anyone know how to run only this test from command line?

from the build directory I usually run "./bin/llvm-lit -v test/tools/llvm-dwarfdump/X86/statistics.ll" for example (yeah, even though that relative path doesn't exist in the build directory, lit knows to also try resolving it from the source dir, etc)

clayborg added a comment. Edited Nov 8 2018, 10:27 AM

No worries - still curious where you were looking for the tests initially, perhaps we can do something to make them more discoverable?

The only thing I could think of that could make it easier is to make a symlink from the tools/llvm-dwarfdump/test which points to ../../test/tools/llvm-dwarfdump

(another way I find tests: Break something nearby/related (eg: remove one of the existing statistics) & see what tests fail)

from the build directory I usually run "./bin/llvm-lit -v test/tools/llvm-dwarfdump/X86/statistics.ll" for example (yeah, even though that relative path doesn't exist in the build directory, lit knows to also try resolving it from the source dir, etc)

Thanks! It will be an easy test to add. One other FileCheck question: it seems like I would want to check that "total function size" and "total inlined function size" exist. Can I rely on the exact byte sizes from this .ll file on all systems? I guess I don't really care what the values are, but I would like to verify that "total function size" > "total inlined function size". Is there a way to do that?

CHECK: "total function size":[[FUNCSIZE:[0-9]+]]
CHECK: "total inlined function size":[[INLINESIZE:[0-9]+]]

Is there a way to check FUNCSIZE > INLINESIZE?

The only thing I could think of that could make it easier is to make a symlink from the tools/llvm-dwarfdump/test which points to ../../test/tools/llvm-dwarfdump

Ah, fair enough - yeah, might just be a hurdle we can live with, then. The tests mostly match the directory hierarchy - but yeah, there are a few quirks.

Thanks! It will be an easy test to add. One other FileCheck question: it seems like I would want to check that "total function size" and "total inlined function size" exist.
Can I rely on the exact byte sizes from this .ll file on all systems?

Only if the test has a specific triple, which, by the looks of it, this test could benefit from. @aprantl - would that make sense? This test is only run when the X86 target is enabled anyway, so adding a specific triple wouldn't make it less portable, and it would allow specific statistic values to be tested, which seems important for test coverage here (as it stands, the byte-count parts of this test look like they would still pass even if the statistics were miscomputed/all zero, etc.).

I guess I don't really care what the value is, but I would like to verify that "total function size" > "total inlined function size". Is there a way to do that?

CHECK: "total function size":[[FUNCSIZE:[0-9]+]]
CHECK: "total inlined function size":[[INLINESIZE:[0-9]+]]

Is there a way to check FUNCSIZE > INLINESIZE?

Yeah, there's no way to do that, unfortunately. But I think (I'll wait for Adrian to weigh in here, as the original author who has more context for this test/feature than I do) it'd make sense to put a specific triple on this test and test specific values for all the stats here.
If adding new features to this test case perturbs all the other statistics in a way that makes the test a pain to maintain (I doubt it will; it's probably cheap enough to change the test, eyeball the new stat values, and just update these "golden" values when we have to), it may be better to split new features that would otherwise perturb things into new test files.
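
(Editor's note, not part of the review: FileCheck alone cannot compare two captured values, but a test could in principle post-process the JSON statistics with a small script and assert the inequality directly. A hypothetical Python sketch; the field names come from this patch, while the helper and sample input are illustrative, not a checked-in test:)

```python
import json

def function_size_exceeds_inlined(stats_json: str) -> bool:
    """Check that the concrete-function byte total is larger than the
    top-level inlined-function byte total in dwarfdump's statistics."""
    stats = json.loads(stats_json)
    return stats["total function size"] > stats["total inlined function size"]

# In a real test, this JSON would come from running
# "llvm-dwarfdump --statistics" on the checked-in test binary.
sample = '{"total function size": 147, "total inlined function size": 39}'
print(function_size_exceeds_inlined(sample))  # → True
```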

clayborg updated this revision to Diff 173231. Nov 8 2018, 2:46 PM

Added checks to the lit test to ensure the function size and inlined function sizes are present.

vsk added a comment. Nov 8 2018, 2:56 PM

This looks good to me. I think it'd help to have a +1 from Adrian, as he's worked on this file much more than I have.

aprantl accepted this revision. Nov 8 2018, 4:35 PM
This revision is now accepted and ready to land. Nov 8 2018, 4:35 PM

@aprantl - any thoughts on the test coverage here. Currently there's no testing that these counters are correct/meaningful - just that they are printed out. Reckon it's worth putting a specific triple here & testing the values to ensure the computation is correct/doesn't regress?

I'm afraid that any numbers won't be really stable, especially the ones about code size. In fact, we get notifications whenever any of those values regresses via LNT (http://lnt.llvm.org/db_default/v4/nts/114748), so that kind of covers the values of the statistics?
For other parameters (number of non-inlined functions), it may be more obvious that expecting a hard-coded value is the right thing.

vsk added a comment. Nov 8 2018, 5:04 PM

Why not check in a .o? The numbers won't change, so they can meaningfully be compared.

In D54217#1292323, @vsk wrote:

Why not check in a .o? The numbers won't change, so they can meaningfully be compared.

An object file or assembly seems fine, yeah.

I should say, I know this can be a bit more of a pain to maintain, since modifications to the test case usually involve rerunning the compiler, etc. (though the same is usually true for LLVM IR debug info metadata; we don't generally hand-craft it). I also wouldn't be averse to an IR test case that was stabilized; there are ways to pretty thoroughly restrict the compiler's ability to optimize anything too far (using calls to functions that are declared but not defined, inline assembly, etc.).

But yeah, assembly or object files are probably the right tool here. Very few test cases for dwarfdump use IR (and I probably would've pushed back on this one going in in the first place if I'd noticed); the rest use object files or assembly.

This revision was automatically updated to reflect the committed changes.