This is an archive of the discontinued LLVM Phabricator instance.

[dwarfdump] Pretty print location expressions and location lists
AbandonedPublic

Authored by aprantl on Dec 23 2014, 10:53 AM.

Details

Summary

With that change the locations exrpessions are pretty printed inline in the
DIE tree. The output looks like this for debug_loc entries:

DW_AT_location [DW_FORM_data4]        (0x00000000
   0x0000000000000001 - 0x000000000000000b: OP_consts +3
   0x000000000000000b - 0x0000000000000012: OP_consts +7
   0x0000000000000012 - 0x000000000000001b: OP_reg0 RAX, OP_piece 0x00000004
   0x000000000000001b - 0x0000000000000024: OP_breg5 RDI+0)

And like this for debug_loc.dwo entries:

DW_AT_location [DW_FORM_sec_offset]   (0x00000000
  Addr idx 2 (w/ length 190): OP_consts +0, OP_stack_value
  Addr idx 3 (w/ length 23): OP_reg0 RAX, OP_piece 0x00000004)

Simple locations wihtout ranges are sumped inlie to:

DW_AT_location [DW_FORM_block1]       (OP_reg4 RSI, OP_piece 0x00000004, OP_bit_piece 0x00000020 0x00000000)

The debug_loc(.dwo) dumping in changed accordingly to factor the code.

I've been sitting on this for much too long, and I'm not totally satisfied
with the code. For example, the detection of wether to use DWO or standard
lcoations in DebugInfoEntryMinimal doesn't look right. Ideas welcome (one
idea would be to have virtual DebugLoc objects and to store that in the
DWARFUnitSection, but it feels slightly over-engineered for the purpose).
The patch also isn't ready to apply as-is. It requires a few preliminary
patches like landing D6243 and adding a MCRegisterInfo in the DWARFContext,
but I've left that out here in favor of the dumping related changes.

Happy Christmas!

Diff Detail

Event Timeline

friss updated this revision to Diff 17601.Dec 23 2014, 10:53 AM
friss retitled this revision from to [dwarfdump] Pretty print location expressions and location lists.
friss added reviewers: dblaikie, samsonov.
friss updated this object.
friss added a subscriber: Unknown Object (MLST).
dblaikie edited edge metadata.Dec 29 2014, 3:04 PM

This'd be nice to have, certainly - but the patch is a bit too big for me to swap in right now. If you're inclined to press on with this, it might be easier to review in smaller pieces (perhaps by skipping the DWO case initially? not sure - maybe there's other ways to break it down)

friss added a comment.Jan 5 2015, 10:21 AM

I could commit the DWARFDebugLoc refactoring (without the pretty printing) separately, but this is about all that can be split out. This patch really has 4 parts that are reviewable separately (but that don't really make sense as separate):

  • The refactoring in DWARFDebugLoc.{cpp/h}: This I could commit separately leaving the new pretty-printing out.
  • The new DWARFExpression.cpp/h}: The bulk of the patch logic. Pretty self contained it interacts with the outside world through just 1 function. Can be reviewed separately.
  • The integration into the DIE/DebugLoc dumpers.
  • The tests updates.

Leaving out the DWO case would spare maybe 10 lines, I don't think this will really make it more digestible for you...

dblaikie added inline comments.Jan 17 2015, 8:17 PM
lib/DebugInfo/DWARFContext.h
134 ↗(On Diff #17601)

Where does this case come up? It looks like the calls to getCompileUnitAtIndex are protected by a getNumCompileUnits check first, no?

lib/DebugInfo/DWARFDebugLoc.cpp
143–146

This would leave an extra element in the Locations container, wouldn't it? (Could be avoided with the return-by-value I was suggesting:

if (Optional<LocationList> o = parseOneLocationList(...))
  Locations.push_back(std::move(*o));
else
  return;

or similar)

151

leading newline seems a surprising contract - but maybe that's convention here, I haven't looked.

lib/DebugInfo/DWARFDebugLoc.h
63 ↗(On Diff #17601)

Would it make sense for this to return the LocationList, rather than use an out parameter?

90 ↗(On Diff #17601)

Possibly return Optional<LocationList> here

lib/DebugInfo/DWARFExpression.cpp
69

Not sure if this needs a class, rather than just a global const vector, and a function to construct and return it for initialization.

243

If this is meant to be resilient to failure/corrupted files, you'd probably need this not to be an assertion. (again, not sure - but I assume if you guys want to ship this it should be pretty robust & corrupt files shouldn't just assert fail/UB (& should be tested))

244

If we're going to skip the DW_ prefix here, should we skip it elsewhere (tags, attributes, and forms)?

256

Breaking out this switch into a function might help readability

301

Favor llvm_unreachable over assert(0)

test/DebugInfo/X86/dbg-value-const-byref.ll
25

The prior version of this test didn't hardcode the address offsets - it'd be good to preserve that to make the test resilient to minor changes (like instruction selection, etc).

test/DebugInfo/X86/debug-loc-offset.ll
50

Not sure I understand the comment

friss added a comment.Jan 19 2015, 8:55 AM

Thanks for taking the time to go though this! I'll respin within a couple of days. Some comments inline:

lib/DebugInfo/DWARFContext.h
134 ↗(On Diff #17601)

Hmmm. I remember asking me the same question when I skimmed over the patch before I sent it out so I must left it intentionally... though I can;t find a good reason right now. Maybe I added the guards afterwards, dunno.

lib/DebugInfo/DWARFDebugLoc.cpp
143–146

There would be no extra element in standard cases. But yes, if parseOneLocation errors out, you'd get an incomplete location in the list. (The normal exit of the loop should always be handled by the isValidOffset() condition).

151

I agree. Only motivation is: putting the newline at the end requires a conditional, because we don't want the last line to have a newline.

lib/DebugInfo/DWARFExpression.cpp
243

This one is one purpose. If we reach that point if the code, it means the Op has a description in the table, so it really should have a name in the support routine. No user input should be able to break that.

244

I did it here and not at the other spots, because the DW_OPs might be multiple on a line. I have no strong feeling about stripping the other spots too.

dblaikie added inline comments.Jan 19 2015, 10:00 AM
lib/DebugInfo/DWARFContext.h
134 ↗(On Diff #17601)

Fair enough - maybe just switch to an assert for now, then.

lib/DebugInfo/DWARFDebugLoc.cpp
143–146

Ah, fair - well, either way then, depending on how pedantic about bad inputs you want to be. (in some ways having the weird extra empty element might at least tell a user there's something weird, compared to silently skipping it)

151

Ah, fair enough

lib/DebugInfo/DWARFExpression.cpp
243

Makes sense.

Oh, btw - should we just put a pointer to the name in the ExpressionOps table?

friss updated this revision to Diff 18823.Jan 27 2015, 9:58 AM
friss edited edge metadata.

I reworked it a bit more than just addressing the review comments. Instead of just
offering a pretty-printing function, expose a real class that allows clients to
walk over the expression operations using an iterator interface. The iterator is
a bit strange because of its lazy extraction semantics and the error handling that
goes with it, but I think it looks quite usable. The pretty-printing is then
implemented over that iterator interface.

friss updated this revision to Diff 18825.Jan 27 2015, 10:02 AM

Try to teach arcanist to leave the straightforward preparation patches out of the review...

dblaikie added inline comments.Feb 1 2015, 5:51 PM
include/llvm/DebugInfo/DWARFDebugLoc.h
80

12?

include/llvm/DebugInfo/DWARFExpression.h
109

operator overloads that can be non-members generally should be (to allow symmetric conversions on LHS and RHS) - could define it inline as a friend if that's useful.

lib/DebugInfo/DWARFDebugLoc.cpp
34

what's the boolean? ('false')

(could use a comment, maybe, or an enum, etc)

test/CodeGen/X86/2011-01-24-DbgValue-Before-Use.ll
12

Maybe CHECK-NOT: DW_{{TAG|AT}} ? I assume this was here so we didn't skip any extra attributes on this TAG.

test/DebugInfo/X86/debug-loc-offset.ll
50

Not sure I follow why the low_pc offset would be problematic - it'll always be zero in a single-CU file, right?

friss added a comment.Feb 4 2015, 10:13 AM

Thanks. I'm testing an update addressing your feedback. Some comments bellow:

include/llvm/DebugInfo/DWARFExpression.h
109

Yeah, but the iterator_facade that I'm using requires a member function to deriver != from ==. I added a couple of non-members operators that forward to the member ones.

lib/DebugInfo/DWARFDebugLoc.cpp
34

It decides wether to use the standard or the EH register mapping. We never use the EH mapping for now, but I thought I'd thread it through anyway. I've changed that to a enum with a default value.

test/CodeGen/X86/2011-01-24-DbgValue-Before-Use.ll
12

I changed that to match the old test, although I'm not sure their is any value in checking the ordering of the attributes within a TAG.

test/DebugInfo/X86/debug-loc-offset.ll
50

This is the whole point of this test. It's a multi-cu test that verifies that the location entries are stored relatively to the low_pc of the unit. The test is that the location's base address should be 0. When it is displayed inline the offset is applied - which is a good thing - but it's not what we want to test. So I defer the test to the raw dump of the debug_lc section bellow.
The comment shouldn't read 'stored here' though, 'deployed here' would be more accurate.

friss updated this revision to Diff 19340.Feb 4 2015, 10:56 AM
  • Rebased on latest master
  • Addressed review comments

Looks pretty reasonable.

Alexey - want to have a quick glance over it, since the dumping code is more your domain these days?

include/llvm/DebugInfo/DWARF/DWARFExpression.h
134 ↗(On Diff #19340)

This would need to be defined with the "inline" keyword or it'll create ODR violations/duplicate symbol linker errors, no?

I'd just put the definition inline in the friend declaration:

friend bool op==(X, Y) {
  ...
}
include/llvm/DebugInfo/DWARFExpression.h
100

would be nice to implement ++ the usual way, returning a reference to the object, rather than void (I forget which one is pre or post increment - anyway, whichever way it's meant to work, either a copy of the unincremented value, or a reference to the incremented value)

lib/DebugInfo/DWARF/DWARFDebugInfoEntry.cpp
130 ↗(On Diff #19340)

Could possibly abstract over these two differences using a function template helper?

Or maybe it'd suffice to put the if (LL)/else outside the if conditios so it can be shared (& could add an else to the main if (sec.data.size()) that returns early to make that work? Dunno how that'd look)

lib/DebugInfo/DWARF/DWARFDebugLoc.cpp
31 ↗(On Diff #19340)

Maybe we should decide whether binary data bytes should be signed or unsigned. Either change DataExtractor to deal with ArrayRef<unsigned char> or change DwarfDebugLoc::Entry::Loc to be a SmallVector<char, N> instead.

lib/DebugInfo/DWARF/DWARFExpression.cpp
32 ↗(On Diff #19340)

This might be more succinctly to define as some kind of array literal?

128 ↗(On Diff #19340)

Perhaps sink this switch into a separate function to avoid the Operands[Operand] repetition in every case & improve readability a little by splitting things up?

samsonov added inline comments.Jun 26 2015, 3:18 PM
include/llvm/DebugInfo/DWARF/DWARFDebugLoc.h
47 ↗(On Diff #19340)

Just move private members decl/def to the top of the class.

62 ↗(On Diff #19340)

Same here - DataExtractor should now the address size.

65 ↗(On Diff #19340)

I think your DataExtractor should know AddressSize instead.

84 ↗(On Diff #19340)

Ditto

include/llvm/DebugInfo/DWARF/DWARFExpression.h
10 ↗(On Diff #19340)

Please fix header guard.

82 ↗(On Diff #19340)

Do you actually use these methods anywhere?

107 ↗(On Diff #19340)

Why do you set Op.Error in iterator, but not in Op::Extract?

include/llvm/DebugInfo/DWARFExpression.h
2

This file should be removed in favor of include/llvm/DebugInfo/DWARF/DWARFExpression.h, right?

lib/DebugInfo/DWARF/DWARFContext.cpp
129 ↗(On Diff #19340)

This is slightly weird - what if there are no compile units, but non-empty .debug_loc_dwo?

lib/DebugInfo/DWARF/DWARFDebugInfoEntry.cpp
119 ↗(On Diff #19340)

extra semicolon

130 ↗(On Diff #19340)

I actually think it's fine to leave this as is to keep the code straightforward.
We may shorten it, though

if (auto LL = DebugLoc.parseOneLocationList) {
  LL->dump(OS, U, Indent);
} else {
  OS << "error extracting location list.";
}
137 ↗(On Diff #19340)

do you need to set indent here?

lib/DebugInfo/DWARF/DWARFDebugLoc.cpp
64 ↗(On Diff #19340)

getAddress() ?

91 ↗(On Diff #19340)

Consider using

E.Loc.resize(Bytes);
if (!Data.getU8(Offset, E.Loc.data(), Bytes)) {
  llvm::errs() << "...."
}

same for parsing U16 above

uint16_t Bytes;
if (!Data.getU16(Offset, &Bytes, 1)) {
  llvm::errs() << "...";
}
101 ↗(On Diff #19340)

isValidOffsetForAddress

lib/DebugInfo/DWARF/DWARFExpression.cpp
110 ↗(On Diff #19340)

This is a copy of method from DWARFFormValue.cpp? Probably the code should be shared, so that there's one place to fix when we support DWARF64.

Fred / David: what is missing for this to land in trunk?
I'm happy to take this over and finish it up if you want to abandon it.

friss abandoned this revision.Dec 5 2016, 2:17 PM

Looks like I dropped the ball on this

aprantl commandeered this revision.Dec 5 2016, 2:18 PM
aprantl added a reviewer: friss.

Stealing this review from Fred.

rnk added a reviewer: rnk.Aug 24 2017, 3:15 PM
rnk edited edge metadata.Aug 24 2017, 3:35 PM

There are 72 remaining test failures: https://reviews.llvm.org/P8028

The printing isn't pretty enough yet, IMO, but I spent a few hours rebasing this so I figured I would post the patch for people to try it out.

rnk added a comment.Aug 24 2017, 3:36 PM

Of course, I meant to send that to the D37123 review, which has a rebased version of this patch. It has some layering hacks, but builds for me locally. PTAL.