This is an archive of the discontinued LLVM Phabricator instance.

DWARF: Implementation of v5 string offsets table (.debug_str_offsets[.dwo] section)/consumer
ClosedPublic

Authored by wolfgangp on May 2 2017, 5:55 PM.

Details

Summary

This review and the producer portion replaces https://reviews.llvm.org/D32618.

This is the proposed implementation of the DWARF5 string offsets table. Since the last iteration in https://reviews.llvm.org/D32618 support for Mach-O and COFF was added and review comments were addressed.

Diff Detail

Event Timeline

wolfgangp created this revision.May 2 2017, 5:55 PM
dblaikie added inline comments.May 4 2017, 11:51 AM
lib/DebugInfo/DWARF/DWARFContext.cpp
101–102

Is there a test case for this?

108–109

Is there a test case for this?

108–110

Maybe flip this around:

if (SegmentSize == 0xffff... ) {
  ...
} else if (SegmentSize > 0xfff...0) {
  ...
}

Though the early return is nice - just be nice if there was a way to avoid checking for the 0xff..ff case twice. At least on the first check perhaps it'd be better to write it as "Size > 0xff..f0 && Size != 0xff..ff" ? rather than < when it'll only check one valeu anyway?

111–112

Is there a test case for this?

119

Does this need to test if the SegmentSize is a multiple of the record size? otherwise the loop below might read a partial record, off the end/past SegmentSize?

119–120

Is there a test case for this?

130–132

Is there a general purpose "apply a reloc" function that could be used here?

1153–1156

Why is this being added now/in this change?

(how did things work without this change - what problem arose that motivated it?)

lib/DebugInfo/DWARF/DWARFFormValue.cpp
628–631

Rather than trying, failing, falling back - should this be implemented in a way to "do the right thing" from the start?

lib/DebugInfo/DWARF/DWARFTypeUnit.cpp
27–30 ↗(On Diff #97532)

Seems like there's a few things in this patch that might be sane to separate - like this DWARF64 support for type units.

lib/DebugInfo/DWARF/DWARFUnit.cpp
78

StringOffsetSectionBase is to support DWP files, I assume? Is that tested?

wolfgangp added inline comments.May 4 2017, 1:14 PM
lib/DebugInfo/DWARF/DWARFContext.cpp
101–102

Hmm, I guess not. This would be one of a handful cases of invalid input, so I will add cases for all of them.

108–109

Will add one.

111–112

Will add one.

119–120

Will add one.

130–132

Yes, there is one. I'm not sure why I didn't see it before. I'll work that in.

1153–1156

Adrian reminded me that I should also support Mach-O and COFF. This is part of the support for Mach-O. Since Mach-O limits section names to 16 bytes and some of the DWARF standard section names are longer than that (e.g. .debug_str_offsets) it seemed prudent to map them here to the expected standard names (and provide a general purpose target-dependent mapping for any future formats).
Otherwise I need some other way to recognize that the section with the name "__debug_str_offs" is supposed to represent the string offset section in a Mach-O object file.

lib/DebugInfo/DWARF/DWARFFormValue.cpp
628–631

I was just trying to fit the new functionality into the existing structure but I'll give it a shot.

lib/DebugInfo/DWARF/DWARFTypeUnit.cpp
27–30 ↗(On Diff #97532)

This is actually a bugfix. The previous code neglected to take the leading length field into account. This was no big deal since the remainder of the type unit DIE is always longer than 4 anyway. It only showed up in a hand-constructed test case where it wasn't.

In any case, I'll be happy to separate this out as an independent bug fix. Perhaps separating out DWARF64 support in general would be better?

lib/DebugInfo/DWARF/DWARFUnit.cpp
78

Ahem, no. I must confess I was only marginally aware of DWP. I have to give it another look.

dblaikie added inline comments.May 8 2017, 12:49 PM
lib/DebugInfo/DWARF/DWARFContext.cpp
1153–1156

Perhaps the mapping could/should be done after the stripping below? Then fewer cases would need to be handled? Only debug_str_offsets, in fact?

(which would help explain my uncertainty/confusion as to how this wasn't changing existing cases since it seems to be remapping all sorts of section names that weren't modified previously - only because they all collapse back to the same thing after the substr+find_first_not_of case below)

lib/DebugInfo/DWARF/DWARFTypeUnit.cpp
27–30 ↗(On Diff #97532)

Independent bug fixes should be separate - easier to review, make sure they're tested, etc.

As for separating DWARF64 support for str_offsets in general? Eh, if it looks nice separately & makes it clearer, OK - if it seems to work well together but still clear that it needs/has separate test coverage, that's OK too.

wolfgangp added inline comments.May 8 2017, 5:59 PM
lib/DebugInfo/DWARF/DWARFTypeUnit.cpp
27–30 ↗(On Diff #97532)

Independent bug fixes should be separate - easier to review, make sure they're tested, etc.

I split this off as https://reviews.llvm.org/D32987

  1. Added tests for invalid formatting of the string offsets section (hand-constructed DWARF).
  2. Fixed a bug found with handling of dwp files and added a hand-constructed test for it.
  3. Addressed review comments: a) rewrote validation of string offset table contribution headers b) moved mapping of debug section names to after stripping of leading _ or .
  4. Using the word 'contribution' instead of 'segment' when displaying the string offsets table.
wolfgangp marked 8 inline comments as done.May 24 2017, 12:45 PM
wolfgangp added inline comments.
lib/DebugInfo/DWARF/DWARFContext.cpp
119

Yes, thanks for catching that.

lib/DebugInfo/DWARF/DWARFFormValue.cpp
628–631

Looked at this for a while, but couldn't come up with a better way to write it. Do you have any suggestions?

dblaikie accepted this revision.May 29 2017, 4:11 PM

A few things to tidy up, but looks plausible.

lib/DebugInfo/DWARF/DWARFContext.cpp
134

Move the declaration to the first use:

uint64_t StringOffset = getRelocatedValue(...
138–142

Probably drop the {} from single line blocks - that tends to be how most of the LLVM code is written.

322–340

Might be worthsinking this all into the dumpStringOffsetsSection function? (potentially pulling out the current function contents as dumpDwarf5StringOffsetsSection or the like?)

test/DebugInfo/dwarfdump-str-offsets-invalid.test
1–9

Should these print error messages of some kind? I mean I realize most of dwarfdump is pretty relaxed about error handling, so perhaps it doesn't make much sense to add more just here - but worth a thought.

test/DebugInfo/dwarfdump-str-offsets.test
57–67

Reckon it'd be worth printing the strings that these offsets refer to here? (since the strings are indirected and printed in the debug_info (which has to go through this table and then on to the strings table) it seems weird to have them missing here, maybe?

This revision is now accepted and ready to land.May 29 2017, 4:11 PM
wolfgangp updated this revision to Diff 101241.Jun 2 2017, 11:14 AM
wolfgangp marked an inline comment as done.

The patch has been accepted but in order to address the last review comments some non-trivial changes were necessary, so it probably needs another look.

  1. error messages will now be emitted when we find invalid contributions to the string offsets table. One test has been added to address the case where we encounter a degenerate .debug_str_offsets section (without a v5 unit present).
  1. Per Dave's suggestion the routines dumping the string offsets table have been reorganized a bit, so now the main dump routine for DWARFContext looks cleaner. Dumping a DWARF v5 string offsets table keys off whether we find a unit with version 5 or greater. Otherwise we dump a monolithic table. That means that the *invalid* tests had to be enhanced a bit to contain at least a rudimentary v5 compile unit.
  1. Dumping the string offsets table now also displays the strings, so the respective test has been updated.
wolfgangp marked 5 inline comments as done.Jun 2 2017, 11:16 AM
wolfgangp added inline comments.
test/DebugInfo/dwarfdump-str-offsets.test
57–67

I agree. I wasn't sure if somebody who is exclusively looking at the string offsets table would really be interested but there is no reason not to have the strings there as well.

dblaikie accepted this revision.Jun 5 2017, 2:37 PM

Oops, forgot to say anything in my approval so it didn't send mail to the list.

Looks good - please submit. :)

This revision was automatically updated to reflect the committed changes.
wolfgangp marked an inline comment as done.