This is an archive of the discontinued LLVM Phabricator instance.

[dwarfdump] Pretty print location expressions and location lists
ClosedPublic

Authored by rnk on Aug 24 2017, 3:34 PM.

Details

Summary

Based on Fred's patch here: https://reviews.llvm.org/D6771

I can't seem to commandeer the old review, so I'm creating a new one.

With that change the locations exrpessions are pretty printed inline in the
DIE tree. The output looks like this for debug_loc entries:

DW_AT_location [DW_FORM_data4]        (0x00000000
   0x0000000000000001 - 0x000000000000000b: OP_consts +3
   0x000000000000000b - 0x0000000000000012: OP_consts +7
   0x0000000000000012 - 0x000000000000001b: OP_reg0 RAX, OP_piece 0x00000004
   0x000000000000001b - 0x0000000000000024: OP_breg5 RDI+0)

And like this for debug_loc.dwo entries:

DW_AT_location [DW_FORM_sec_offset]   (0x00000000
  Addr idx 2 (w/ length 190): OP_consts +0, OP_stack_value
  Addr idx 3 (w/ length 23): OP_reg0 RAX, OP_piece 0x00000004)

Simple locations without ranges are printed inline:

DW_AT_location [DW_FORM_block1]       (OP_reg4 RSI, OP_piece 0x00000004, OP_bit_piece 0x00000020 0x00000000)

The debug_loc(.dwo) dumping in changed accordingly to factor the code.

Diff Detail

Repository
rL LLVM

Event Timeline

rnk created this revision.Aug 24 2017, 3:34 PM
rnk added a comment.Aug 24 2017, 3:36 PM

There are 72 remaining test failures: https://reviews.llvm.org/P8028

The printing isn't pretty enough yet, IMO, but I spent a few hours rebasing this so I figured I would post the patch for people to try it out.

aprantl added inline comments.Aug 24 2017, 4:59 PM
llvm/include/llvm/DebugInfo/DWARF/DWARFExpression.h
91 ↗(On Diff #112622)

We use autobrief in our Doxygen configuration.

llvm/lib/DebugInfo/DWARF/DWARFExpression.cpp
106 ↗(On Diff #112622)

Does this assert on invalid DWARF? Or is this an internal consistency check?

aprantl edited edge metadata.Aug 24 2017, 8:14 PM

Thanks a lot for picking this up!

llvm/include/llvm/DebugInfo/DWARF/DWARFExpression.h
86 ↗(On Diff #112622)

extract

88 ↗(On Diff #112622)

print

116 ↗(On Diff #112622)

are provided

llvm/include/llvm/DebugInfo/DWARFExpression.h
80 ↗(On Diff #112622)

extract

82 ↗(On Diff #112622)

print

There are two copies of DWARFExpression.h?

Aesthetically the parens around the output location list look weird. But if that's one at some higher level I guess we can live with it.

rnk marked 6 inline comments as done.Aug 25 2017, 1:36 PM
rnk added inline comments.
llvm/include/llvm/DebugInfo/DWARFExpression.h
80 ↗(On Diff #112622)

Hm, this appears to be a duplicate file from the rebase. I've removed it.

llvm/lib/DebugInfo/DWARF/DWARFExpression.cpp
25 ↗(On Diff #112622)

Maybe this could be an ArrayRef.

106 ↗(On Diff #112622)

Descriptions actually has 256 entries, and the opcodes are one byte, so this can't fail as written.

rnk updated this revision to Diff 112744.Aug 25 2017, 2:03 PM
rnk marked an inline comment as done.
  • updated, 25 test failures remaining
dblaikie edited edge metadata.Aug 25 2017, 2:56 PM

(Adrian: Reckon it's worth dropping the "OP_" prefix too, to simplify? (or bump it up to "DW_OP_" and then maybe have the brief version drop "DW_X_" prefixes across the board (since they're unambiguous?)

llvm/include/llvm/DebugInfo/DWARF/DWARFDebugLoc.h
72 ↗(On Diff #112744)

Worth switching this from 'unsigned char' to 'char' to avoid the conversions back and forth on reading/writing? (more compatible with DataExtractor, etc)

llvm/include/llvm/DebugInfo/DWARF/DWARFExpression.h
125 ↗(On Diff #112744)

What's the point of this function? Since the DWARFExpression is already a range it seems strange to have a range accessor on top of that. It's used in one place where "operations()" could be replaced with "*this" I think?

llvm/lib/DebugInfo/DWARF/DWARFContext.cpp
870–871 ↗(On Diff #112744)

Worth fixing sooner rather than later? (ie: before this is committed) - maybe a callback for warnings here?

(in addition to racing, it's not re-usable - multiple uses of libDebugInfo in a single process would see this only once which seems unfortunate)

llvm/lib/DebugInfo/DWARF/DWARFDebugLoc.cpp
32–33 ↗(On Diff #112744)

Formatting (I guess there should be spaces around the '+')

32–33 ↗(On Diff #112744)

This handling of the base address could probably be a separate patch & would justify plumbing through the DWARFUnit, etc, simplifying this patch that needs it for the expression dumping as well.

llvm/lib/DebugInfo/DWARF/DWARFDie.cpp
104 ↗(On Diff #112744)

Prefer !empty over size, I think. (similarly below)

llvm/lib/DebugInfo/DWARF/DWARFExpression.cpp
32–33 ↗(On Diff #112744)

This reinitializes (possibly racing) the vector contents every time it's called? Maybe use {} init instead?

(I see it's static and only called once, but still a bit weird - I'd probably just drop this function and use {} init in the function below that needs it)

rnk updated this revision to Diff 112754.Aug 25 2017, 3:39 PM
  • fix all remaining tests
rnk marked 5 inline comments as done.Aug 28 2017, 9:47 AM

I think the remaining concerns are:

  1. Should we split the location list base address change into another patch
  2. Should we try to avoid one-time dynamic initialization for the DW_OP description table
  3. Should we remove the "OP_" prefix

In general, I'd like to defer anything that people feel can wait, so that I can minimize the amount of time that I'm carrying this patch locally. It has a high risk for conflicts vs. people adding new dwarf tests, and I've already put in a lot of work updating the test suite.

llvm/include/llvm/DebugInfo/DWARF/DWARFExpression.h
125 ↗(On Diff #112744)

Good question, removed.

llvm/lib/DebugInfo/DWARF/DWARFContext.cpp
870–871 ↗(On Diff #112744)

I made this an optional method that the creator can call to enable register name dumping. It returns an error. All clients that don't call it will just get dumps like "OP_reg5" instead of "OP_reg5 RDI", which isn't a big deal.

llvm/lib/DebugInfo/DWARF/DWARFExpression.cpp
32–33 ↗(On Diff #112744)

The problem with braced initialization is that not all 256 entries are filled, so it requires duplicating not just the order of the expression opcode entries, but also the values. I didn't like that. We could make the Descriptions vector not be static, and make the caller take the vector by value so it owns it.

I was able to rewrite this to return a constexpr array inside a struct wrapper, but MSVC didn't like it.

rnk updated this revision to Diff 112911.Aug 28 2017, 9:49 AM
rnk marked an inline comment as done.
  • Address David's comments
In D37123#854201, @rnk wrote:

I think the remaining concerns are:

  1. Should we split the location list base address change into another patch

Typically the answer to the question whether we should split up a patch is always yes, but I'm fine either way.

  1. Should we try to avoid one-time dynamic initialization for the DW_OP description table

Would it perhaps make sense to initialize the table from the macros in Dwarf.def?

  1. Should we remove the "OP_" prefix

I think we should implement two behaviors: When DumpOptions::Brief is set, we should print expressions with DW_OP_ stripped and registers in human-readable form, e.g.: rdi+42, deref, stack-value, otherwise it should print the full expression, e.g.: DW_OP_breg1, DW_OP_constu 42, DW_OP_deref, DW_OP_stack_value.

(And at some point soon, I'll upload a patch to make brief the default instead of verbose).

In general, I'd like to defer anything that people feel can wait, so that I can minimize the amount of time that I'm carrying this patch locally. It has a high risk for conflicts vs. people adding new dwarf tests, and I've already put in a lot of work updating the test suite.

Makes sense!

rnk added inline comments.Aug 28 2017, 12:59 PM
llvm/lib/DebugInfo/DWARF/DWARFContext.cpp
248–251 ↗(On Diff #112911)

When I tried factoring out the base address patch, I ran into this comment. How would we do this correctly? What scenario would cause us to pick the wrong compile unit? Picking the wrong compile unit would add the wrong base address to the .debug_loc entries.

rnk added inline comments.Aug 28 2017, 1:13 PM
llvm/lib/DebugInfo/DWARF/DWARFContext.cpp
248–251 ↗(On Diff #112911)

I've convinced myself that this code is wrong and we actually have a test in tree for this. It doesn't test the two-CU location list case, though, so it wasn't obvious to me that this was wrong.

I'll revert this change. Most of the value is in the expression printing, not the base address change.

rnk updated this revision to Diff 112973.Aug 28 2017, 2:45 PM
  • Revert CU base address adjustment from location list dumps
  • Remove all DWARFUnit threading
  1. Should we try to avoid one-time dynamic initialization for the DW_OP description table

Would it perhaps make sense to initialize the table from the macros in Dwarf.def?

+1, worth considering.

  1. Should we remove the "OP_" prefix

I think we should implement two behaviors: When DumpOptions::Brief is set, we should print expressions with DW_OP_ stripped and registers in human-readable form, e.g.: rdi+42, deref, stack-value, otherwise it should print the full expression, e.g.: DW_OP_breg1, DW_OP_constu 42, DW_OP_deref, DW_OP_stack_value.

Sounds good & consistent to me!

(And at some point soon, I'll upload a patch to make brief the default instead of verbose).

Excellent

With all that in mind - Signing off on this contingent on removing teh drop(3) (so printing DW_OP_* rather than OP_*). Considering all the other stuff for consideration before or shortly after this is committed, as you see fit, Reid.

llvm/include/llvm/DebugInfo/DWARF/DWARFExpression.h
100 ↗(On Diff #112973)

Naming the variable the same as the type is a bit awkward (given the need to then qualify the class name later ("class Op")) - maybe 'O' would do? The scope's short enough the name shouldn't be too confusing.

Or naming the type "Operation" might be helpful.

llvm/lib/DebugInfo/DWARF/DWARFExpression.cpp
106 ↗(On Diff #112622)

Then shouldn't the parameter be a one byte sized type & that ought to follow through to the source of the data?

228 ↗(On Diff #112973)

Reckon it's worth dropping 6? (OP_ is pretty redundant)

Though as-is, or with that suggestion, it's a bit inconsistent with our other DWARF enum printing (that keeps the DW_ prefix).

Maybe we should just drop the DW_ suffix everywhere? (I wonder about dropping DW_*_ everywhere - but maybe that'll only be done in 'brief' mode)

dblaikie accepted this revision.Aug 28 2017, 8:38 PM

Helps if I actually 'accept'...

This revision is now accepted and ready to land.Aug 28 2017, 8:38 PM

Awesome, I can't wait for this to land!

rnk added a comment.Aug 29 2017, 2:21 PM
  1. Should we try to avoid one-time dynamic initialization for the DW_OP description table

Would it perhaps make sense to initialize the table from the macros in Dwarf.def?

+1, worth considering.

  1. Should we remove the "OP_" prefix

I think we should implement two behaviors: When DumpOptions::Brief is set, we should print expressions with DW_OP_ stripped and registers in human-readable form, e.g.: rdi+42, deref, stack-value, otherwise it should print the full expression, e.g.: DW_OP_breg1, DW_OP_constu 42, DW_OP_deref, DW_OP_stack_value.

Sounds good & consistent to me!

(And at some point soon, I'll upload a patch to make brief the default instead of verbose).

Excellent

With all that in mind - Signing off on this contingent on removing teh drop(3) (so printing DW_OP_* rather than OP_*). Considering all the other stuff for consideration before or shortly after this is committed, as you see fit, Reid.

Got it. It's easy to go from OP_ to DW_OP_, but a lot harder to go from no prefix to DW_OP_ prefix...

rnk marked an inline comment as done.Aug 29 2017, 2:34 PM
rnk added inline comments.
llvm/lib/DebugInfo/DWARF/DWARFExpression.cpp
228 ↗(On Diff #112973)

To be clear, my plan is to emit DW_OP_ for now. We can shorten it by default or with a flag in a follow up.

This revision was automatically updated to reflect the committed changes.