This is an archive of the discontinued LLVM Phabricator instance.

[llvm-dwarfdump] - Fix incorrect parsing of the DW_LLE_startx_length
ClosedPublic

Authored by grimar on Oct 23 2018, 5:48 AM.

Details

Summary

As was already mentioned in comments for D53364, DWARF 5 spec says about:

DW_LLE_startx_length: "This is a form of bounded location description that has two unsigned ULEB operands.
The first value is an address index (into the .debug_addr section) that indicates the beginning of the address range
over which the location is valid. The second value is the length of the range. ")

Currently, the length is parsed as U32, patch fixes the issue.

Diff Detail

Event Timeline

grimar created this revision.Oct 23 2018, 5:48 AM
grimar edited the summary of this revision. (Show Details)Oct 23 2018, 5:49 AM
grimar added a comment.EditedOct 23 2018, 8:03 AM

It turns out LLVM emits the U32 now:
https://github.com/llvm-mirror/llvm/blob/master/lib/CodeGen/AsmPrinter/DwarfDebug.cpp#L1972

I'll prepare the patch for it too. Seems we will want to land these at the same time then?

Or should we handle DW_LLE_startx_length for .debug_loc (DWARF 4 dwo case) and .debug_loclists (DWARF5) differently for compatibility (I guess no as it does not match the specification)?

aprantl accepted this revision.Oct 23 2018, 8:41 AM
This revision is now accepted and ready to land.Oct 23 2018, 8:41 AM

It turns out LLVM emits the U32 now:
https://github.com/llvm-mirror/llvm/blob/master/lib/CodeGen/AsmPrinter/DwarfDebug.cpp#L1972

I'll prepare the patch for it too. Seems we will want to land these at the same time then?

Or should we handle DW_LLE_startx_length for .debug_loc (DWARF 4 dwo case) and .debug_loclists (DWARF5) differently for compatibility (I guess no as it does not match the specification)?

Yep, looks like this changed between the pre-standard definition ( https://gcc.gnu.org/wiki/DebugFission - "A start/length entry contains one unsigned LEB128 number and a 4-byte unsigned value (as would be represented by the form code DW_FORM_const4u). The first number is an index into the .debug_addr section that selects the beginning offset, and the second number is the length of the range. ") and the standard one (http://dwarfstd.org/doc/DWARF5.pdf - "This is a form of bounded location description that has two unsigned ULEB operands. The first value is an address index (into the .debug_addr section) that indicates the beginning of the address range over which the location is valid. The second value is the length of the range.")

So it should change based on DWARF version, unfortunately.

So it should change based on DWARF version, unfortunately.

What happens when the llvm-dwarfdump with this patch reads the old format? Depending on the answer it might be useful to also add a --verify check for this. I'll leave this up to you.

So it should change based on DWARF version, unfortunately.

What happens when the llvm-dwarfdump with this patch reads the old format? Depending on the answer it might be useful to also add a --verify check for this. I'll leave this up to you.

That's what I mean by "should change based on DWARF version" - this patch, as-is, is insufficient (since it'd break parsing pre-standard split-DWARF).

Not sure what --verify could do about this - since the data isn't self-describing, bytes are bytes and these rules specify how to interpret those bytes. If that makes sense?

aprantl requested changes to this revision.Oct 23 2018, 11:13 AM

So it should change based on DWARF version, unfortunately.

What happens when the llvm-dwarfdump with this patch reads the old format? Depending on the answer it might be useful to also add a --verify check for this. I'll leave this up to you.

That's what I mean by "should change based on DWARF version" - this patch, as-is, is insufficient (since it'd break parsing pre-standard split-DWARF).

Agreed.

Not sure what --verify could do about this - since the data isn't self-describing, bytes are bytes and these rules specify how to interpret those bytes. If that makes sense?

You're right. This is an abbrev entry; just raw data.

This revision now requires changes to proceed.Oct 23 2018, 11:13 AM
grimar updated this revision to Diff 170867.Oct 24 2018, 6:21 AM
  • Parse DW_LLE_startx_length differently for v4/pre-v5 and v5 sections.
dblaikie accepted this revision.Oct 24 2018, 10:30 AM

Looks good, thanks!

This revision was not accepted when it landed; it landed in state Needs Review.Oct 25 2018, 3:59 AM
This revision was automatically updated to reflect the committed changes.