This is an archive of the discontinued LLVM Phabricator instance.

llvm-dwarfdump: Improve/fix pretty printing of array dimensions
ClosedPublic

Authored by dblaikie on Dec 14 2018, 2:21 PM.

Details

Summary

This is to address post-commit feedback from Paul Robinson on r348954.

The original commit misinterprets count and upper bound as the same thing (I thought I saw GCC producing an upper bound the same as Clang's count, but GCC correctly produces an upper bound that's one less than the count (in C, that is, where arrays are zero indexed)).

I want to preserve the C-like output for the common case, so in the absence of a lower bound the count (or one greater than the upper bound) is rendered between []. In the trickier cases, where a lower bound is specified, a half-open range is used (eg: lower bound 1, count 2 would be "[1, 3)" and an unknown parts use a '?' (eg: "[1, ?)" or "[?, 7)" or "[?, ? + 3)").

Diff Detail

Event Timeline

dblaikie created this revision.Dec 14 2018, 2:21 PM

Note that the lower-bound might be implicit, but it's never "unknown."
We do need to account for the cases where the implicit lower-bound is 1. The Fortran people will need this, and the OpenVMS folks support several other languages that also have a default LB of 1.

Using [C] (C = count) for both the count-only and upper-bound-only cases suggests that we're being descriptive rather than carefully reflecting the attributes that are in the subrange DIE, which is fine. So, if an explicit LB matches the language default, then the display should also be [C] (where C = count if we have a count, or C = U + (1 - L) if we have upper_bound).

We'd show two values [L,U] or [L,C) only if L was specified and non-default. If we don't have either U or C, we get [L,?).

How does that sound?

Note that the lower-bound might be implicit, but it's never "unknown."
We do need to account for the cases where the implicit lower-bound is 1. The Fortran people will need this, and the OpenVMS folks support several other languages that also have a default LB of 1.

Using [C] (C = count) for both the count-only and upper-bound-only cases suggests that we're being descriptive rather than carefully reflecting the attributes that are in the subrange DIE, which is fine. So, if an explicit LB matches the language default, then the display should also be [C] (where C = count if we have a count, or C = U + (1 - L) if we have upper_bound).

The only change you're suggesting is to detect when an explicit LB matches the language default, and treating that case the same as if the LB were not specified (in terms of how it's printed out)?

I'd say it's a nice improvement, but not a requirement & not sure the balance of encoding the lower bound defaults table from the DWARF spec is worth that improvement in printing? But I don't feel super strongly either way.

We'd show two values [L,U] or [L,C) only if L was specified and non-default. If we don't have either U or C, we get [L,?).

How does that sound?

Note that the lower-bound might be implicit, but it's never "unknown."
We do need to account for the cases where the implicit lower-bound is 1. The Fortran people will need this, and the OpenVMS folks support several other languages that also have a default LB of 1.

Using [C] (C = count) for both the count-only and upper-bound-only cases suggests that we're being descriptive rather than carefully reflecting the attributes that are in the subrange DIE, which is fine. So, if an explicit LB matches the language default, then the display should also be [C] (where C = count if we have a count, or C = U + (1 - L) if we have upper_bound).

The only change you're suggesting is to detect when an explicit LB matches the language default, and treating that case the same as if the LB were not specified (in terms of how it's printed out)?

Sorry, no, in the course of editing the comment I lost the important part: When converting UB-only to Count, you need to correctly factor in the default LB: C = UB + (1 - DefaultLBForLanguage) otherwise the display for 1-based languages will be incorrect.

Then, since you have to know the default LB anyway, the rest seems not too hard.

dblaikie updated this revision to Diff 178692.Dec 18 2018, 8:39 AM

Use language default lower bound to inform size calculation from upper_bound with no lower bound
Use [size] rendering when explicit lower bound matches language default

Note that the lower-bound might be implicit, but it's never "unknown."
We do need to account for the cases where the implicit lower-bound is 1. The Fortran people will need this, and the OpenVMS folks support several other languages that also have a default LB of 1.

Using [C] (C = count) for both the count-only and upper-bound-only cases suggests that we're being descriptive rather than carefully reflecting the attributes that are in the subrange DIE, which is fine. So, if an explicit LB matches the language default, then the display should also be [C] (where C = count if we have a count, or C = U + (1 - L) if we have upper_bound).

The only change you're suggesting is to detect when an explicit LB matches the language default, and treating that case the same as if the LB were not specified (in terms of how it's printed out)?

Sorry, no, in the course of editing the comment I lost the important part: When converting UB-only to Count, you need to correctly factor in the default LB: C = UB + (1 - DefaultLBForLanguage) otherwise the display for 1-based languages will be incorrect.

Then, since you have to know the default LB anyway, the rest seems not too hard.

Ah, fair enough - done those two things (omitting the lower bound when an explicit lower bound matches the language default & using the lower bound to accurately compute the size for [size] rendering).

I chose to use half open ranges always (even in the case where the input was lower/upper_bound, not lower_bound+count) to avoid visual ambiguity of "int[[x, y]]" - it's not clear what those x and y mean, why there are double squares around them, etc. Whereas "int[[x, y)]" is hopefully a bit more clear.

Cool, I find these thing really helpful when looking at DWARF output so I'm excited to see them getting more advanced.

lib/DebugInfo/DWARF/DWARFDie.cpp
249–250

nit: Could we maybe extract this into a static dumpArray helper?

dblaikie updated this revision to Diff 178747.Dec 18 2018, 11:48 AM

Pull out array printing into a helper function

A couple small things. If Jonas is happy, I'm happy.

lib/DebugInfo/DWARF/DWARFDie.cpp
157

dumpArrayType as it's not actually dumping an array of anything

180

else if here? Otherwise it looks like the no-attributes-at-all case would dump
[][[?,?)] which seems like overkill.

dblaikie updated this revision to Diff 178790.Dec 18 2018, 2:36 PM
dblaikie marked an inline comment as done.

Rename dumpArray to dumpArrayType
Implement & test behavior for subrange without any attributes (no low/high/count).

dblaikie marked 3 inline comments as done.Dec 18 2018, 2:36 PM
dblaikie added inline comments.
lib/DebugInfo/DWARF/DWARFDie.cpp
180

Ah, thanks - good catch. Added a test & fixed the issue.

249–250

Sure - done!

JDevlieghere accepted this revision.Dec 18 2018, 3:22 PM

Looks like everybody's happy! LGTM

This revision is now accepted and ready to land.Dec 18 2018, 3:22 PM
This revision was automatically updated to reflect the committed changes.
dblaikie marked an inline comment as done.