This is an archive of the discontinued LLVM Phabricator instance.

llvm/tools/llvm-dwarfdump/Statistics.cpp
416–417	I think this assert should come before the assignment and be something like this to catch a value that would "overflow" by wrapping: `assert(GlobalStats.ScopeBytesCovered + ScopeBytesCovered >= GlobalStats.ScopeBytesCovered && "ScopeBytesCovered - overflow");` Otherwise I don't think this assertion will ever catch anything, since all uint64_t values are <= UINT64_MAX.

Seems like a generally reasonable direction forward.

llvm/tools/llvm-dwarfdump/Statistics.cpp
416–417	Yep! I think the more general assert probably looks like this: `assert(x <= max - y); x += y;`
418–419	When would this assert fire? If `ScopeEntryValueBytesCovered` is a uint64_t, it can never be greater than the max uint64_t value. Checking for overflow would usually be done, I think, with a check before the addition: `assert(x <= max - y); x += y;`
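For illustration only, the check-before-add pattern both inline comments describe might look like the sketch below. The names `wouldOverflow` and `addChecked` are hypothetical and do not come from the patch.

```cpp
#include <cassert>
#include <cstdint>
#include <limits>

// Hypothetical helper (not part of the patch): returns true when X + Y would
// wrap around for uint64_t. The test runs before the addition, so it never
// depends on an already-wrapped result.
inline bool wouldOverflow(uint64_t X, uint64_t Y) {
  return X > std::numeric_limits<uint64_t>::max() - Y;
}

// Accumulates Y into X, asserting first that the sum is representable.
inline void addChecked(uint64_t &X, uint64_t Y) {
  assert(!wouldOverflow(X, Y) && "stat field overflow");
  X += Y;
}
```

In a release build the assert compiles away, so this only catches the wrap in asserts-enabled builds.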

no other comments, thanks for the fix!

• hafixo added a commit: rCRT373035: hwasan: Compatibility fixes for short granules..Sep 6 2021, 12:44 AM

• hafixo added a commit: rGc336557f0238: hwasan: Compatibility fixes for short granules..Sep 6 2021, 12:47 AM

djtodoro added inline comments.Sep 6 2021, 2:01 AM

llvm/tools/llvm-dwarfdump/Statistics.cpp
416–417	Oh yes, thanks

Could something ala:
https://gcc.gnu.org/onlinedocs/gcc/Integer-Overflow-Builtins.html
help?

In D109217#2984816, @tschuett wrote:

Could something ala:
https://gcc.gnu.org/onlinedocs/gcc/Integer-Overflow-Builtins.html
help?

Probably enough to write the portable code (so it works on MSVC too, etc) and let the compiler optimize it. At least for this saturation code, Clang produces the same with or without the intrinsic (& GCC produces something else entirely - to itself and to clang): https://godbolt.org/z/jb3nnaKvT
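A rough sketch of the two variants being compared, assuming a saturating add on uint64_t. The function names are made up for this example, and the builtin path is only taken on GCC-compatible compilers; this is not the code behind the godbolt link.

```cpp
#include <cassert>
#include <cstdint>
#include <limits>

// Portable saturating add: clamps to the maximum instead of wrapping.
inline uint64_t saturatingAddPortable(uint64_t X, uint64_t Y) {
  const uint64_t Max = std::numeric_limits<uint64_t>::max();
  return X > Max - Y ? Max : X + Y;
}

// Same operation via the GCC/Clang overflow builtin where available. Other
// compilers (e.g. MSVC) fall back to the portable version.
inline uint64_t saturatingAddBuiltin(uint64_t X, uint64_t Y) {
#if defined(__GNUC__) || defined(__clang__)
  uint64_t Result;
  if (__builtin_add_overflow(X, Y, &Result))
    return std::numeric_limits<uint64_t>::max();
  return Result;
#else
  return saturatingAddPortable(X, Y);
#endif
}
```

Both functions return the same values for all inputs; whether they compile to the same instructions depends on the compiler and optimization level.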

In D109217#2985644, @dblaikie wrote:

In D109217#2984816, @tschuett wrote:

Could something ala:
https://gcc.gnu.org/onlinedocs/gcc/Integer-Overflow-Builtins.html
help?

Probably enough to write the portable code (so it works on MSVC too, etc) and let the compiler optimize it. At least for this saturation code, Clang produces the same with or without the intrinsic (& GCC produces something else entirely - to itself and to clang): https://godbolt.org/z/jb3nnaKvT

Maybe LLVM should learn saturating integers that assert in debug mode?

In D109217#2985647, @tschuett wrote:

In D109217#2985644, @dblaikie wrote:

In D109217#2984816, @tschuett wrote:

Could something ala:
https://gcc.gnu.org/onlinedocs/gcc/Integer-Overflow-Builtins.html
help?

Probably enough to write the portable code (so it works on MSVC too, etc) and let the compiler optimize it. At least for this saturation code, Clang produces the same with or without the intrinsic (& GCC produces something else entirely - to itself and to clang): https://godbolt.org/z/jb3nnaKvT

Maybe LLVM should learn saturating integers that assert in debug mode?

Perhaps. (though anything that asserts in debug mode should basically be UB in non-debug mode, in my opinion - if you aren't testing/using the functionality, it shouldn't be defined)

For ints we've got UBSan, but unsigned ints are defined on wrap. There's no unsigned type that's UB on overflow - certainly might be nice to have them to clarify the difference between a thing you want to do weird bitfiddling with and expect all the overflow, etc, and a thing that's meant to do maths and where sanitizers could diagnose overflow, etc.

But I think a couple of manual overflow checks here is probably OK - might be worth putting it in a generic function and applying it to all the statistics to make things more robust/generic.

For ints we've got UBSan, but unsigned ints are defined on wrap. There's no unsigned type that's UB on overflow - certainly might be nice to have them to clarify the difference between a thing you want to do weird bitfiddling with and expect all the overflow, etc, and a thing that's meant to do maths and where sanitizers could diagnose overflow, etc.

-fsanitize=unsigned-integer-overflow ?

In D109217#2985680, @xbolva00 wrote:

For ints we've got UBSan, but unsigned ints are defined on wrap. There's no unsigned type that's UB on overflow - certainly might be nice to have them to clarify the difference between a thing you want to do weird bitfiddling with and expect all the overflow, etc, and a thing that's meant to do maths and where sanitizers could diagnose overflow, etc.

-fsanitize=unsigned-integer-overflow ?

Oh, right, we do have that :) (but no doubt LLVM isn't remotely clean of failures for it)

djtodoro mentioned this in D109347: [JSON] Handle uint64_t type.Sep 7 2021, 1:38 AM

djtodoro added a parent revision: D109347: [JSON] Handle uint64_t type.

add a test
bump the stats version
address the comments (I think that these few asserts are enough for this)
split JSON part into a separate patch

djtodoro retitled this revision from [NOT FOR COMMIT] [llvm-dwarfdump] Fix unsigned overflow when calculating stats to [llvm-dwarfdump] Fix unsigned overflow when calculating stats.Sep 7 2021, 1:49 AM

Harbormaster completed remote builds in B122833: Diff 371007.Sep 7 2021, 2:32 AM

Just one more inline question from me, but I will defer to the other reviewers for the rest & approval. Thanks for fixing this.

llvm/tools/llvm-dwarfdump/Statistics.cpp
789–804	I'm not sure how much this matters but looking at the comment above I don't think a version bump is necessary for this patch? Any input that didn't trip the assertions previously will still have the same output with the patch applied.

thopre removed a commit: rGc336557f0238: hwasan: Compatibility fixes for short granules..Sep 7 2021, 2:47 AM

thopre removed a commit: rCRT373035: hwasan: Compatibility fixes for short granules..Sep 7 2021, 2:51 AM

djtodoro added inline comments.Sep 7 2021, 3:28 AM

llvm/tools/llvm-dwarfdump/Statistics.cpp
789–804	I think since this is a bug fix we should bump it -- e.g., a stat number could have been 0 (as a consequence of the bug), and now it will be 2^32 for example, right? I think that this is the purpose behind the stats version.
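To make the wrap concrete, here is a small sketch using the numbers from locstats-big-number-of-bytes.yaml in this revision (two scopes of 0xffffffff bytes each). The helper `sumScopeBytes` is hypothetical and only stands in for the accumulation done in Statistics.cpp.

```cpp
#include <cassert>
#include <cstdint>

// Sums scope sizes into an accumulator of type AccT. Unsigned arithmetic
// wraps modulo 2^N, so a too-narrow AccT silently yields a wrong total.
template <typename AccT>
AccT sumScopeBytes(const uint64_t *Sizes, unsigned Count) {
  AccT Total = 0;
  for (unsigned I = 0; I < Count; ++I)
    Total += static_cast<AccT>(Sizes[I]);
  return Total;
}
```

With a 32-bit accumulator the two scopes sum to 4294967294, while a 64-bit accumulator gives the 8589934590 that the new test checks for, so the same input can produce different output before and after the fix.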

Let me revisit the saturating integer without asserts. If you print 5 for a uint32_t, you will never know whether it never overflowed or overflowed 10 times (in release mode).

A saturating integer will print 5 or max int (saturated).

Even a SaturatingUint<uint64_t> shouldn't add much overhead.

In D109217#2987177, @tschuett wrote:

Let me revisit the saturating integer without asserts. If you print 5 for a uint32_t, you will never know whether it never overflowed or overflowed 10 times (in release mode).

A saturating integer will print 5 or max int (saturated).

Even a SaturatingUint<uint64_t> shouldn't add much overhead.

Hmm, is there an implementation of the SaturatingUint, or do we need to implement such type?

Not at the moment. I just wanted to pitch the idea of having a saturating integer in LLVM.

I think we can implement it here, and it will be useful/safe.
The question is if that should be implemented as a general thing in LLVM.

It seems like the downside of using asserts to detect the "overflow" (the patch's current approach) is that release-config users may still get misleading stats due to wrapping. Implementing a saturating int in and of itself doesn't seem like a full solution, since a user may not notice that the stat is the saturated value, or even know that it's special, especially if the stats are consumed by another tool.

IMO when the stat cannot be computed properly - however the detection is implemented, either with saturating ints, checks like the ones in the asserts, or something else - a good solution for users would be to print a message, and either skip printing the "bad" stats or all of them. That would be consistent for all build configurations and avoid hiding the issue. What do you think? Sorry if I'm just stating the obvious!
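One possible shape for that idea, as a sketch only: track validity alongside the value and decide at print time whether to emit the stat or a message. `CheckedStat` and `printStat` are hypothetical names, and the message text is invented for the example.

```cpp
#include <cassert>
#include <cstdint>
#include <limits>
#include <ostream>
#include <sstream>
#include <string>

// Hypothetical accumulator (not from the patch): once an addition would
// overflow, the stat is marked invalid and stays that way.
struct CheckedStat {
  uint64_t Value = 0;
  bool Valid = true;

  void add(uint64_t Amount) {
    if (!Valid)
      return;
    if (Value > std::numeric_limits<uint64_t>::max() - Amount)
      Valid = false;
    else
      Value += Amount;
  }
};

// Prints `"Name": Value` to Out for a valid stat. For an invalid one, prints
// a warning to Diag and writes nothing to Out. Returns true if printed.
inline bool printStat(std::ostream &Out, std::ostream &Diag,
                      const std::string &Name, const CheckedStat &Stat) {
  if (!Stat.Valid) {
    Diag << "warning: stat '" << Name << "' overflowed and was skipped\n";
    return false;
  }
  Out << '"' << Name << "\": " << Stat.Value << '\n';
  return true;
}
```

This behaves the same in debug and release builds, at the cost of one extra bool per stat.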

llvm/tools/llvm-dwarfdump/Statistics.cpp
789–804	Sounds reasonable (I wasn't thinking about release mode when I made that comment).

You will always know whether it is a genuine max int or max int because of saturating behaviour.

class SatUint32 {
  uint32_t value;
  bool overflowed;
};

The saturating integer class would use the builtins I mentioned above to perform arithmetic operations on value and detect overflow and set overflowed to true.

I'd just saturate to max int, and use the max int value to indicate overflow. Shaving one value off to represent the overflow state seems fine to me.
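A minimal sketch of that flag-free variant, assuming a generic `SaturatingUint<T>` along the lines pitched earlier in the thread. This is not the `SaturatingUINT64` from the patch; the member names `get` and `saturated` are invented for the example.

```cpp
#include <cassert>
#include <cstdint>
#include <limits>
#include <type_traits>

// Saturating unsigned counter. The maximum value of T doubles as the
// "overflowed" sentinel, so no separate flag is stored.
template <typename T> class SaturatingUint {
  static_assert(std::is_unsigned<T>::value,
                "T must be an unsigned integer type");
  T Value;

public:
  static constexpr T max() { return std::numeric_limits<T>::max(); }

  SaturatingUint(T V = 0) : Value(V) {}

  SaturatingUint &operator+=(T Rhs) {
    // Check before adding so the comparison never sees a wrapped value.
    Value = (Value > max() - Rhs) ? max() : static_cast<T>(Value + Rhs);
    return *this;
  }

  SaturatingUint &operator++() { return *this += 1; }

  T get() const { return Value; }

  // True once the counter has reached the sentinel. A sum that lands exactly
  // on max() is also reported as saturated; that is the one value given up.
  bool saturated() const { return Value == max(); }
};
```

A caller would accumulate with `+=` or `++` and consult `saturated()` before trusting `get()`.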

That is fine by me. I guess the point is a safe way to collect statistics and give guidance to users when the results could be bad, in release and debug mode. I would argue that saturating integers are a different and maybe more precise solution than going from uint32_t to uint64_t ...

In D109217#2992699, @tschuett wrote:

That is fine by me. I guess the point is a safe way to collect statistics and give guidance to users when the results could be bad, in release and debug mode. I would argue that saturating integers are a different and maybe more precise solution than going from uint32_t to uint64_t ...

Well, both, probably - support use cases that weren't supported before (binaries that were too large to fit in the existing stats) and, separately/additionally, some way of reporting overflow rather than reporting bogus values.

Introduce the SaturatingUINT64

In D109217#2992694, @dblaikie wrote:

I'd just saturate to max int, and use the max int value to indicate overflow. Shaving one value off to represent the overflow state seems fine to me.

+1, do not need Overflow flag.

Harbormaster completed remote builds in B123657: Diff 372234.Sep 13 2021, 7:24 AM

-remove the isOverflow field

Harbormaster completed remote builds in B124372: Diff 373193.Sep 17 2021, 5:37 AM

dblaikie added inline comments.Sep 17 2021, 9:03 AM

llvm/tools/llvm-dwarfdump/Statistics.cpp
71	Personally, I think once we've defined the behavior on overflow, we shouldn't assert that overflow doesn't happen - this undermines the concept of having defined behavior on overflow (& makes it somewhat harder to test - since that behavior can now only be tested in a non-asserts build (well, I guess since assert isn't UB-if-false, and the assert is after the warning, that's not the case, but it's a bit subtle)). Also, might it make more sense to do the warning/etc on the final use/printing out of the statistic instead? (I guess that's difficult because some statistics are derived from others? - so catching it the moment it overflows means it'll always be diagnosed only once, rather than multiple times due to multiple uses?)

djtodoro added inline comments.Sep 20 2021, 8:15 AM

llvm/tools/llvm-dwarfdump/Statistics.cpp
71	Thanks for the suggestions. I totally agree. I guess that it makes sense to report the warning when printing the overflowed value, since we can point to the specific field.

addressing comments
- now the warning looks as follows:

"#call site DIEs": N (llvm-dwarfdump: warning: this field overflows),

Harbormaster completed remote builds in B124671: Diff 373595.Sep 20 2021, 8:25 AM

In D109217#3009505, @djtodoro wrote:
addressing comments

now the warning looks as follows:
"#call site DIEs": N (llvm-dwarfdump: warning: this field overflows),

Maybe we could render it symbolically and just say:

"#call site DIEs": >= 9223372036854775807

But yeah, maybe the warning is more suitable, not sure - I'll leave it up to you folks to decide what's best.

@cmtice @rdhindsa - might be handy if you folks are aware of this in terms of quirks when encoding the stats from our internal analysis pipelines, in case the format chosen here needs to be taken into account for how to render out of range data.

In D109217#3011103, @dblaikie wrote:
In D109217#3009505, @djtodoro wrote:
addressing comments

now the warning looks as follows:
"#call site DIEs": N (llvm-dwarfdump: warning: this field overflows),
Maybe we could render it symbolically and just say:
"#call site DIEs": >= 9223372036854775807
But yeah, maybe the warning is more suitable, not sure - I'll leave it up to you folks to decide what's best.

I'd prefer the warning since it will be easier when parsing the JSON data from utilities such as llvm-locstats.

djtodoro added a child revision: D110621: [llvm-locstats] Report a warning if overflow was detected by llvm-dwarfdump.Sep 28 2021, 5:10 AM

ping :)

dblaikie added inline comments.Oct 5 2021, 12:15 PM

llvm/test/tools/llvm-dwarfdump/X86/locstats-bytes-overflow.yaml
25	I think it'd be worth CHECKing the specific/full syntax, rather than just "this warning text appears somewhere in the output" - since we're specifically putting it in the output in a particular place. Hmm, that raises a question: is this warning going to stderr, but the actual stats output is to stdout? If so then I think that's a different problem. Maybe that answers one of my other questions though when I suggested printing the output as "> max int" - https://reviews.llvm.org/D109217#3012175 - is that what you meant in this comment? That the value in the JSON data doesn't have any mention of the warning/overflow, and only has the saturated integer value? I worry that's error-prone though, since the value is incorrect and a tool might not be aware of that. So I think it may be valuable to ensure we don't encode a valid value/something that could be mistaken for a valid value in the field when it's overflowed? JSON supports the value being a string, I assume - so perhaps a string representation of ">= max int" or "overflowed" or something would be suitable?
llvm/tools/llvm-dwarfdump/Statistics.cpp
51–56	Maybe just implement this in terms of the other:

djtodoro added inline comments.Oct 7 2021, 1:18 AM

llvm/test/tools/llvm-dwarfdump/X86/locstats-bytes-overflow.yaml
25	Yep, that was my concern. It was going to stderr, but the JSON data goes to stdout.
> JSON supports the value being a string, I assume - so perhaps a string representation of ">= max int" or "overflowed" or something would be suitable?
Hmm, I agree... using some special value will make this output ready for outside tools. I'll update that.
llvm/tools/llvm-dwarfdump/Statistics.cpp
51–56	yes, thanks

introduce "overflow" special stats value for the fields that overflow
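As a sketch of what such a special value could look like at print time: the function below treats the uint64_t maximum as the saturation sentinel and emits a JSON string in its place. `renderStatField` is a hypothetical name, the exact string used by the committed patch may differ, and key escaping is ignored here.

```cpp
#include <cassert>
#include <cstdint>
#include <limits>
#include <string>

// Renders one statistics entry as a JSON key/value pair. A value equal to the
// uint64_t maximum is treated as the saturation sentinel and emitted as the
// string "overflow" rather than as a number, so a consumer cannot mistake it
// for a valid count.
inline std::string renderStatField(const std::string &Key, uint64_t Value) {
  std::string Out = "\"" + Key + "\": ";
  if (Value == std::numeric_limits<uint64_t>::max())
    Out += "\"overflow\"";
  else
    Out += std::to_string(Value);
  return Out;
}
```

A tool parsing the JSON then has to handle a string where it previously expected a number, which is the point: the failure is visible instead of silent.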

Harbormaster completed remote builds in B127455: Diff 377759.Oct 7 2021, 3:52 AM

This looks alright to me - though I'll leave it to other folks with more statistics interest to provide final approval.

(@cmtice @rdhindsa - something to be aware of that might crop-up in google uses of the statistics infrastructure, I'd expect)

In D109217#3048505, @dblaikie wrote:

This looks alright to me - though I'll leave it to other folks with more statistics interest to provide final approval.

(@cmtice @rdhindsa - something to be aware of that might crop-up in google uses of the statistics infrastructure, I'd expect)

Sure, thanks for your comments!

djtodoro added a reviewer: aprantl.Oct 7 2021, 1:39 PM

I think this is reasonable. Out of curiosity, would there be a benefit to using APInt? Probably not because 64 bits is already huge...

In D109217#3049369, @aprantl wrote:

I think this is reasonable. Out of curiosity, would there be a benefit to using APInt? Probably not because 64 bits is already huge...

I guess we can use APInt as well, but the 64 bits seem enough for now. A challenge would be to teach all the front-ends how to parse the big numbers (>64bits), but it is achievable.

Ok. Works for me.

llvm/tools/llvm-dwarfdump/Statistics.cpp
53	https://en.cppreference.com/w/cpp/types/numeric_limits/max ?

This revision is now accepted and ready to land.Oct 11 2021, 10:01 AM

Thanks!

-use the std::numeric_limits<uint64_t>::max()

Harbormaster completed remote builds in B128288: Diff 378908.Oct 12 2021, 1:43 AM

djtodoro mentioned this in rG8c3adce81dc3: [JSON] Handle uint64_t type.Oct 15 2021, 2:19 AM

Closed by commit rGc450e47a8c2d: [llvm-dwarfdump] Fix unsigned overflow when calculating stats (authored by djtodoro).Oct 15 2021, 3:16 AM

This revision was automatically updated to reflect the committed changes.

djtodoro added a commit: rGc450e47a8c2d: [llvm-dwarfdump] Fix unsigned overflow when calculating stats.

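Revision Contents

Path	Size
llvm/test/tools/llvm-dwarfdump/X86/locstats-big-number-of-bytes.yaml	92 lines
llvm/test/tools/llvm-dwarfdump/X86/locstats-bytes-overflow.yaml	91 lines
llvm/test/tools/llvm-dwarfdump/X86/locstats-for-absctract-origin-vars.yaml	2 lines
llvm/test/tools/llvm-dwarfdump/X86/statistics-dwo.test	2 lines
llvm/test/tools/llvm-dwarfdump/X86/statistics-v3.test	2 lines
llvm/test/tools/llvm-dwarfdump/X86/statistics.ll	2 lines
llvm/test/tools/llvm-dwarfdump/X86/stats-scope-bytes-covered.yaml	2 lines
llvm/tools/llvm-dwarfdump/Statistics.cpp	205 lines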

Diff 373193

llvm/test/tools/llvm-dwarfdump/X86/locstats-big-number-of-bytes.yaml

This file was added.

				# RUN: yaml2obj %s | llvm-dwarfdump --statistics - | FileCheck %s

				## Check that we are covering the situation when
				## sum of bytes in scope is a huge (uint64_t) number.
				##
				## The yaml represents this DWARF:
				##
				## DW_TAG_compile_unit
				## DW_AT_low_pc (0x0000000000000000)
				## DW_AT_high_pc (0x000000000000000b)
				##
				## DW_TAG_subprogram
				## DW_AT_low_pc (0x0000000000000000)
				## DW_AT_high_pc (0x00000000ffffffff)
				## DW_TAG_variable
				## DW_AT_location (0x00000023:
				## [0x0000000000000003, 0x0000000000000005): DW_OP_reg2 RCX)
				## DW_TAG_subprogram
				## DW_AT_low_pc (0x0000000000000000)
				## DW_AT_high_pc (0x00000000ffffffff)
				## DW_TAG_variable
				## DW_AT_location (0x00000023:
				## [0x0000000000000003, 0x0000000000000005): DW_OP_reg2 RCX)

				# CHECK: "version": 9,
				# CHECK: "sum_all_variables(#bytes in parent scope)": 8589934590

				--- !ELF
				FileHeader:
				  Class: ELFCLASS64
				  Data: ELFDATA2LSB
				  Type: ET_EXEC
				  Machine: EM_X86_64
				Sections:
				  - Name: .debug_loc
				    Type: SHT_PROGBITS
				    AddressAlign: 0x01
				    Content: '00000000000000000600000000000000010055000000000000000000000000000000000300000000000000050000000000000001005200000000000000000000000000000000'
				  - Name: .debug_ranges
				    Type: SHT_PROGBITS
				    AddressAlign: 0x01
				    Content: '000000000000000003000000000000000500000000000000080000000000000000000000000000000000000000000000'
				DWARF:
				  debug_abbrev:
				    - Table:
				        - Code: 1
				          Tag: DW_TAG_compile_unit
				          Children: DW_CHILDREN_yes
				          Attributes:
				            - Attribute: DW_AT_low_pc
				              Form: DW_FORM_addr
				            - Attribute: DW_AT_high_pc
				              Form: DW_FORM_data4
				        - Code: 2
				          Tag: DW_TAG_subprogram
				          Children: DW_CHILDREN_yes
				          Attributes:
				            - Attribute: DW_AT_low_pc
				              Form: DW_FORM_addr
				            - Attribute: DW_AT_high_pc
				              Form: DW_FORM_data4
				        - Code: 3
				          Tag: DW_TAG_variable
				          Children: DW_CHILDREN_no
				          Attributes:
				            - Attribute: DW_AT_location
				              Form: DW_FORM_sec_offset
				  debug_info:
				    - Version: 4
				      AbbrOffset: 0x00
				      Entries:
				        - AbbrCode: 1 ## DW_TAG_compile_unit
				          Values:
				            - Value: 0x00 ## DW_AT_low_pc
				            - Value: 0x0b ## DW_AT_high_pc
				        - AbbrCode: 2 ## DW_TAG_subprogram
				          Values:
				            - Value: 0x00 ## DW_AT_low_pc
				            - Value: 0xFFFFFFFF ## DW_AT_high_pc
				        - AbbrCode: 3 ## DW_TAG_variable
				          Values:
				            - Value: 0x23 ## DW_AT_sec_offset
				        - AbbrCode: 0 ## NULL
				        - AbbrCode: 2 ## DW_TAG_subprogram
				          Values:
				            - Value: 0x00 ## DW_AT_low_pc
				            - Value: 0xFFFFFFFF ## DW_AT_high_pc
				        - AbbrCode: 3 ## DW_TAG_variable
				          Values:
				            - Value: 0x23 ## DW_AT_sec_offset
				        - AbbrCode: 0 ## NULL
				        - AbbrCode: 0 ## NULL

llvm/test/tools/llvm-dwarfdump/X86/locstats-bytes-overflow.yaml

This file was added.

				# REQUIRES: asserts
				# RUN: yaml2obj %s | not --crash llvm-dwarfdump --statistics - -o /dev/null %s 2>&1 | FileCheck %s

				## Check that we are covering the situation when a stat field overflows.
				##
				## The yaml represents this DWARF:
				##
				## DW_TAG_compile_unit
				## DW_AT_low_pc (0x0000000000000000)
				## DW_AT_high_pc (0x000000000000000b)
				##
				## DW_TAG_subprogram
				## DW_AT_low_pc (0x0000000000000000)
				## DW_AT_high_pc (0xffffffffffffffff)
				## DW_TAG_variable
				## DW_AT_location (0x00000023:
				## [0x0000000000000003, 0x0000000000000005): DW_OP_reg2 RCX)
				## DW_TAG_subprogram
				## DW_AT_low_pc (0x0000000000000000)
				## DW_AT_high_pc (0xffffffffffffffff)
				## DW_TAG_variable
				## DW_AT_location (0x00000023:
				## [0x0000000000000003, 0x0000000000000005): DW_OP_reg2 RCX)

				# CHECK: Stat field overflow

				--- !ELF
				FileHeader:
				  Class: ELFCLASS[[BITS=64]]
				  Data: ELFDATA2LSB
				  Type: ET_EXEC
				  Machine: EM_X86_64
				Sections:
				  - Name: .debug_loc
				    Type: SHT_PROGBITS
				    AddressAlign: 0x01
				    Content: '00000000000000000600000000000000010055000000000000000000000000000000000300000000000000050000000000000001005200000000000000000000000000000000'
				  - Name: .debug_ranges
				    Type: SHT_PROGBITS
				    AddressAlign: 0x01
				    Content: '000000000000000003000000000000000500000000000000080000000000000000000000000000000000000000000000'
				DWARF:
				  debug_abbrev:
				    - Table:
				        - Code: 1
				          Tag: DW_TAG_compile_unit
				          Children: DW_CHILDREN_yes
				          Attributes:
				            - Attribute: DW_AT_low_pc
				              Form: DW_FORM_addr
				            - Attribute: DW_AT_high_pc
				              Form: DW_FORM_data8
				        - Code: 2
				          Tag: DW_TAG_subprogram
				          Children: DW_CHILDREN_yes
				          Attributes:
				            - Attribute: DW_AT_low_pc
				              Form: DW_FORM_addr
				            - Attribute: DW_AT_high_pc
				              Form: DW_FORM_data8
				        - Code: 3
				          Tag: DW_TAG_variable
				          Children: DW_CHILDREN_no
				          Attributes:
				            - Attribute: DW_AT_location
				              Form: DW_FORM_sec_offset
				  debug_info:
				    - Version: 4
				      AbbrOffset: 0x00
				      Entries:
				        - AbbrCode: 1 ## DW_TAG_compile_unit
				          Values:
				            - Value: 0x00 ## DW_AT_low_pc
				            - Value: 0x0b ## DW_AT_high_pc
				        - AbbrCode: 2 ## DW_TAG_subprogram
				          Values:
				            - Value: 0x00 ## DW_AT_low_pc
				            - Value: 0xFFFFFFFFFFFFFFFF ## DW_AT_high_pc
				        - AbbrCode: 3 ## DW_TAG_variable
				          Values:
				            - Value: 0x23 ## DW_AT_sec_offset
				        - AbbrCode: 0 ## NULL
				        - AbbrCode: 2 ## DW_TAG_subprogram
				          Values:
				            - Value: 0x00 ## DW_AT_low_pc
				            - Value: 0xFFFFFFFFFFFFFFFF ## DW_AT_high_pc
				        - AbbrCode: 3 ## DW_TAG_variable
				          Values:
				            - Value: 0x23 ## DW_AT_sec_offset
				        - AbbrCode: 0 ## NULL
				        - AbbrCode: 0 ## NULL

llvm/test/tools/llvm-dwarfdump/X86/locstats-for-absctract-origin-vars.yaml

	## DW_AT_decl_line (1)
	## DW_TAG_lexical_block
	## DW_TAG_variable <--(0x000000f8)
	## DW_AT_decl_file (0x01)
	## DW_AT_decl_line (1)
	## DW_TAG_subprogram
	## DW_AT_abstract_origin (0x000000f0)

	- # CHECK: "version": 8,
	+ # CHECK: "version": 9,
	# CHECK: "#variables processed by location statistics": 17,
	# CHECK: "#variables with 0% of parent scope covered by DW_AT_location": 13,
	# CHECK: "#variables with 100% of parent scope covered by DW_AT_location": 4,

	--- !ELF
	FileHeader:
	  Class: ELFCLASS64
	  Data: ELFDATA2LSB

llvm/test/tools/llvm-dwarfdump/X86/statistics-dwo.test

	# printf ("fibonacci(9) = %d\n", result);
	# result = fib(10);
	# printf ("fibonacci(10) = %d\n", result);
	#
	# return 0;
	# }
	#

	- CHECK: "version": 8,
	+ CHECK: "version": 9,
	CHECK: "#functions": 3,
	CHECK: "#functions with location": 3,
	CHECK: "#inlined functions": 7,
	CHECK: "#inlined functions with abstract origins": 7,
	CHECK: "#unique source variables": 9,
	CHECK: "#source variables": 30,

	# Ideally the value below would be 33 but currently it's not.

llvm/test/tools/llvm-dwarfdump/X86/statistics-v3.test

	# printf ("fibonacci(9) = %d\n", result);
	# result = fib(10);
	# printf ("fibonacci(10) = %d\n", result);
	#
	# return 0;
	# }
	#

	- CHECK: "version": 8,
	+ CHECK: "version": 9,
	CHECK: "#functions": 3,
	CHECK: "#functions with location": 3,
	CHECK: "#inlined functions": 8,
	CHECK: "#inlined functions with abstract origins": 8,
	CHECK: "#unique source variables": 9,
	CHECK: "#source variables": 33,

	# Ideally the value below would be 33 but currently it's not.

llvm/test/tools/llvm-dwarfdump/X86/statistics.ll

	; RUN: llc -O0 %s -o - -filetype=obj \
	; RUN: | llvm-dwarfdump -statistics - | FileCheck %s
	- ; CHECK: "version": 8,
	+ ; CHECK: "version": 9,

	; namespace test {
	; extern int a;
	; }
	; using test::a;
	;
	; int GlobalConst = 42;
	; int Global;

llvm/test/tools/llvm-dwarfdump/X86/stats-scope-bytes-covered.yaml

	##
	## // #bytes in parent scope: 6
	## // #bytes in any scope covered by DW_AT_location: 2
	## // #bytes in parent scope covered by DW_AT_location: 0
	## DW_TAG_variable
	## DW_AT_location (0x00000023:
	## [0x0000000000000003, 0x0000000000000005): DW_OP_reg2 RCX)

	- # CHECK: "version": 8,
	+ # CHECK: "version": 9,
	# CHECK: "sum_all_variables(#bytes in parent scope)": 12,
	# CHECK: "sum_all_variables(#bytes in any scope covered by DW_AT_location)": 8
	# CHECK: "sum_all_variables(#bytes in parent scope covered by DW_AT_location)": 4

	--- !ELF
	FileHeader:
	  Class: ELFCLASS64
	  Data: ELFDATA2LSB

llvm/tools/llvm-dwarfdump/Statistics.cpp

/// This represents variables DIE offsets.
using AbstractOriginVarsTy = llvm::SmallVector<uint64_t>;
/// This maps function DIE offset to its variables.
using AbstractOriginVarsTyMap = llvm::DenseMap<uint64_t, AbstractOriginVarsTy>;
/// This represents function DIE offsets containing an abstract_origin.
using FunctionsWithAbstractOriginTy = llvm::SmallVector<uint64_t>;

/// This represents a data type for the stats and it helps us to
/// detect an overflow.
/// NOTE: This can be implemented as a template if there is an another type
/// needing this.
struct SaturatingUINT64 {
  /// Number that represents the stats.
  uint64_t Value;

  SaturatingUINT64(uint64_t Value_) : Value(Value_) {}

Lint: Pre-merge checks: clang-tidy: warning: invalid case style for parameter 'Value_' [readability-identifier-naming]

  void operator++(int) {
    if (Value != UINT64_MAX) {
      if (Value < UINT64_MAX - 1) {
        ++Value;
      } else {
        WithColor::warning(errs(), "llvm-dwarfdump --statistics")
            << "A stat field overflow.\n";
        Value = UINT64_MAX;

dblaikie: Maybe just implement this in terms of the other:

void operator++(int) {
- if (Value != UINT64_MAX) {
-   if (Value < UINT64_MAX - 1)
-     ++Value;
-   else
-     Value = UINT64_MAX;
- }
+ *this += 1;
}
void operator+=(uint64_t Value_) {

djtodoro: yes, thanks

      }
    }
    assert(Value != UINT64_MAX && "Stat field overflow");
  }

  void operator+=(uint64_t Value_) {
    if (Value != UINT64_MAX) {
      if (Value < UINT64_MAX - Value_)
        Value += Value_;
      else {
        WithColor::warning(errs(), "llvm-dwarfdump --statistics")
            << "A stat field overflow.\n";
        Value = UINT64_MAX;
      }
    }
    assert(Value != UINT64_MAX && "Stat field overflow");

dblaikie (marked Not Done): Personally, I think once we've defined the behavior on overflow, we shouldn't assert that overflow doesn't happen. That undermines the concept of having defined behavior on overflow and makes it somewhat harder to test, since that behavior could then only be tested in a non-asserts build. (Well, I guess since assert isn't UB-if-false, and the assert is after the warning, that's not the case, but it's a bit subtle.)

Also, might it make more sense to do the warning/etc. on the final use/printing out of the statistic instead? (I guess that's difficult because some statistics are derived from others, so catching it the moment it overflows means it'll always be diagnosed only once, rather than multiple times due to multiple uses?)

djtodoro (author, marked Done): Thanks for the suggestions. I totally agree.

I guess that it makes sense to report the warning when printing the overflowed value, since we can point to the specific field.
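One way to picture reporting at print time is sketched below. This is an assumption-laden illustration, not the patch's code: `printStat` is a hypothetical helper, plain `std::ostream` stands in for LLVM's `raw_ostream` and `WithColor::warning`, and the output format is made up. A value that legitimately equals the maximum cannot be told apart from a saturated one in this scheme.

```cpp
#include <cassert>
#include <cstdint>
#include <limits>
#include <ostream>
#include <sstream>
#include <string>

// Hypothetical helper: print one statistic and, if the stored value sits at
// the saturation limit, warn and name the affected field.
// Returns true when a warning was emitted.
bool printStat(std::ostream &OS, std::ostream &Err, const std::string &Name,
               uint64_t Value) {
  const bool Saturated = Value == std::numeric_limits<uint64_t>::max();
  if (Saturated)
    Err << "warning: statistic '" << Name << "' overflowed; value is clamped\n";
  OS << '"' << Name << "\": " << Value << '\n';
  return Saturated;
}

// Small usage example: only the saturated field produces a warning.
void exampleUsage() {
  std::ostringstream Out, Err;
  assert(!printStat(Out, Err, "ordinary field", 1024));
  assert(printStat(Out, Err, "saturated field",
                   std::numeric_limits<uint64_t>::max()));
}
```

Because the check runs where the field name is known, the message can point at the specific statistic, which is the benefit mentioned in the reply above.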

}

};

  /// Holds statistics for one function (or other entity that has a PC range and
  /// contains variables, such as a compile unit).
  struct PerFunctionStats {
    /// Number of inlined instances of this function.
-   unsigned NumFnInlined = 0;
+   uint64_t NumFnInlined = 0;
    /// Number of out-of-line instances of this function.
-   unsigned NumFnOutOfLine = 0;
+   uint64_t NumFnOutOfLine = 0;
    /// Number of inlined instances that have abstract origins.
-   unsigned NumAbstractOrigins = 0;
+   uint64_t NumAbstractOrigins = 0;
    /// Number of variables and parameters with location across all inlined
    /// instances.
-   unsigned TotalVarWithLoc = 0;
+   uint64_t TotalVarWithLoc = 0;
    /// Number of constants with location across all inlined instances.
-   unsigned ConstantMembers = 0;
+   uint64_t ConstantMembers = 0;
    /// Number of artificial variables, parameters or members across all instances.
-   unsigned NumArtificial = 0;
+   uint64_t NumArtificial = 0;
    /// List of all Variables and parameters in this function.
    StringSet<> VarsInFunction;
    /// Compile units also cover a PC range, but have this flag set to false.
    bool IsFunction = false;
    /// Function has source location information.
    bool HasSourceLocation = false;
    /// Number of function parameters.
-   unsigned NumParams = 0;
+   uint64_t NumParams = 0;
    /// Number of function parameters with source location.
-   unsigned NumParamSourceLocations = 0;
+   uint64_t NumParamSourceLocations = 0;
    /// Number of function parameters with type.
-   unsigned NumParamTypes = 0;
+   uint64_t NumParamTypes = 0;
    /// Number of function parameters with a DW_AT_location.
-   unsigned NumParamLocations = 0;
+   uint64_t NumParamLocations = 0;
    /// Number of local variables.
-   unsigned NumLocalVars = 0;
+   uint64_t NumLocalVars = 0;
    /// Number of local variables with source location.
-   unsigned NumLocalVarSourceLocations = 0;
+   uint64_t NumLocalVarSourceLocations = 0;
    /// Number of local variables with type.
-   unsigned NumLocalVarTypes = 0;
+   uint64_t NumLocalVarTypes = 0;
    /// Number of local variables with DW_AT_location.
-   unsigned NumLocalVarLocations = 0;
+   uint64_t NumLocalVarLocations = 0;
  };

  /// Holds accumulated global statistics about DIEs.
  struct GlobalStats {
    /// Total number of PC range bytes covered by DW_AT_locations.
-   unsigned TotalBytesCovered = 0;
+   SaturatingUINT64 TotalBytesCovered = 0;
    /// Total number of parent DIE PC range bytes covered by DW_AT_Locations.
-   unsigned ScopeBytesCovered = 0;
+   SaturatingUINT64 ScopeBytesCovered = 0;
    /// Total number of PC range bytes in each variable's enclosing scope.
-   unsigned ScopeBytes = 0;
+   SaturatingUINT64 ScopeBytes = 0;
    /// Total number of PC range bytes covered by DW_AT_locations with
    /// the debug entry values (DW_OP_entry_value).
-   unsigned ScopeEntryValueBytesCovered = 0;
+   SaturatingUINT64 ScopeEntryValueBytesCovered = 0;
    /// Total number of PC range bytes covered by DW_AT_locations of
    /// formal parameters.
-   unsigned ParamScopeBytesCovered = 0;
+   SaturatingUINT64 ParamScopeBytesCovered = 0;
    /// Total number of PC range bytes in each parameter's enclosing scope.
-   unsigned ParamScopeBytes = 0;
+   SaturatingUINT64 ParamScopeBytes = 0;
    /// Total number of PC range bytes covered by DW_AT_locations with
    /// the debug entry values (DW_OP_entry_value) (only for parameters).
-   unsigned ParamScopeEntryValueBytesCovered = 0;
+   SaturatingUINT64 ParamScopeEntryValueBytesCovered = 0;
    /// Total number of PC range bytes covered by DW_AT_locations (only for local
    /// variables).
-   unsigned LocalVarScopeBytesCovered = 0;
+   SaturatingUINT64 LocalVarScopeBytesCovered = 0;
    /// Total number of PC range bytes in each local variable's enclosing scope.
-   unsigned LocalVarScopeBytes = 0;
+   SaturatingUINT64 LocalVarScopeBytes = 0;
    /// Total number of PC range bytes covered by DW_AT_locations with
    /// the debug entry values (DW_OP_entry_value) (only for local variables).
-   unsigned LocalVarScopeEntryValueBytesCovered = 0;
+   SaturatingUINT64 LocalVarScopeEntryValueBytesCovered = 0;
    /// Total number of call site entries (DW_AT_call_file & DW_AT_call_line).
-   unsigned CallSiteEntries = 0;
+   SaturatingUINT64 CallSiteEntries = 0;
    /// Total number of call site DIEs (DW_TAG_call_site).
-   unsigned CallSiteDIEs = 0;
+   SaturatingUINT64 CallSiteDIEs = 0;
    /// Total number of call site parameter DIEs (DW_TAG_call_site_parameter).
-   unsigned CallSiteParamDIEs = 0;
+   SaturatingUINT64 CallSiteParamDIEs = 0;
    /// Total byte size of concrete functions. This byte size includes
    /// inline functions contained in the concrete functions.
-   unsigned FunctionSize = 0;
+   SaturatingUINT64 FunctionSize = 0;
    /// Total byte size of inlined functions. This is the total number of bytes
    /// for the top inline functions within concrete functions. This can help
    /// tune the inline settings when compiling to match user expectations.
-   unsigned InlineFunctionSize = 0;
+   SaturatingUINT64 InlineFunctionSize = 0;
  };

  /// Holds accumulated debug location statistics about local variables and
  /// formal parameters.
  struct LocationStats {
    /// Map the scope coverage decile to the number of variables in the decile.
    /// The first element of the array (at the index zero) represents the number
    /// of variables with the no debug location at all, but the last element
    /// in the vector represents the number of fully covered variables within
    /// its scope.
-   std::vector<unsigned> VarParamLocStats{
+   std::vector<uint64_t> VarParamLocStats{