This is an archive of the discontinued LLVM Phabricator instance.

DWARF: Implementation of v5 string offsets table (.debug_str_offsets[.dwo] section)/consumer
ClosedPublic

Authored by wolfgangp on May 2 2017, 5:55 PM.

Details

Summary

This review and the producer portion replaces https://reviews.llvm.org/D32618.

This is the proposed implementation of the DWARF5 string offsets table. Since the last iteration in https://reviews.llvm.org/D32618 support for Mach-O and COFF was added and review comments were addressed.

Diff Detail

Repository
rL LLVM

Event Timeline

wolfgangp created this revision.May 2 2017, 5:55 PM
dblaikie added inline comments.May 4 2017, 11:51 AM
lib/DebugInfo/DWARF/DWARFContext.cpp
100–101 ↗(On Diff #97532)

Is there a test case for this?

107–108 ↗(On Diff #97532)

Is there a test case for this?

107–109 ↗(On Diff #97532)

Maybe flip this around:

if (SegmentSize == 0xffff... ) {
  ...
} else if (SegmentSize > 0xfff...0) {
  ...
}

Though the early return is nice - just be nice if there was a way to avoid checking for the 0xff..ff case twice. At least on the first check perhaps it'd be better to write it as "Size > 0xff..f0 && Size != 0xff..ff" ? rather than < when it'll only check one valeu anyway?

110–111 ↗(On Diff #97532)

Is there a test case for this?

118 ↗(On Diff #97532)

Does this need to test if the SegmentSize is a multiple of the record size? otherwise the loop below might read a partial record, off the end/past SegmentSize?

118–119 ↗(On Diff #97532)

Is there a test case for this?

129–131 ↗(On Diff #97532)

Is there a general purpose "apply a reloc" function that could be used here?

1047–1050 ↗(On Diff #97532)

Why is this being added now/in this change?

(how did things work without this change - what problem arose that motivated it?)

lib/DebugInfo/DWARF/DWARFFormValue.cpp
608–610 ↗(On Diff #97532)

Rather than trying, failing, falling back - should this be implemented in a way to "do the right thing" from the start?

lib/DebugInfo/DWARF/DWARFTypeUnit.cpp
27–30 ↗(On Diff #97532)

Seems like there's a few things in this patch that might be sane to separate - like this DWARF64 support for type units.

lib/DebugInfo/DWARF/DWARFUnit.cpp
81 ↗(On Diff #97532)

StringOffsetSectionBase is to support DWP files, I assume? Is that tested?

wolfgangp added inline comments.May 4 2017, 1:14 PM
lib/DebugInfo/DWARF/DWARFContext.cpp
100–101 ↗(On Diff #97532)

Hmm, I guess not. This would be one of a handful cases of invalid input, so I will add cases for all of them.

107–108 ↗(On Diff #97532)

Will add one.

110–111 ↗(On Diff #97532)

Will add one.

118–119 ↗(On Diff #97532)

Will add one.

129–131 ↗(On Diff #97532)

Yes, there is one. I'm not sure why I didn't see it before. I'll work that in.

1047–1050 ↗(On Diff #97532)

Adrian reminded me that I should also support Mach-O and COFF. This is part of the support for Mach-O. Since Mach-O limits section names to 16 bytes and some of the DWARF standard section names are longer than that (e.g. .debug_str_offsets) it seemed prudent to map them here to the expected standard names (and provide a general purpose target-dependent mapping for any future formats).
Otherwise I need some other way to recognize that the section with the name "__debug_str_offs" is supposed to represent the string offset section in a Mach-O object file.

lib/DebugInfo/DWARF/DWARFFormValue.cpp
608–610 ↗(On Diff #97532)

I was just trying to fit the new functionality into the existing structure but I'll give it a shot.

lib/DebugInfo/DWARF/DWARFTypeUnit.cpp
27–30 ↗(On Diff #97532)

This is actually a bugfix. The previous code neglected to take the leading length field into account. This was no big deal since the remainder of the type unit DIE is always longer than 4 anyway. It only showed up in a hand-constructed test case where it wasn't.

In any case, I'll be happy to separate this out as an independent bug fix. Perhaps separating out DWARF64 support in general would be better?

lib/DebugInfo/DWARF/DWARFUnit.cpp
81 ↗(On Diff #97532)

Ahem, no. I must confess I was only marginally aware of DWP. I have to give it another look.

dblaikie added inline comments.May 8 2017, 12:49 PM
lib/DebugInfo/DWARF/DWARFContext.cpp
1047–1050 ↗(On Diff #97532)

Perhaps the mapping could/should be done after the stripping below? Then fewer cases would need to be handled? Only debug_str_offsets, in fact?

(which would help explain my uncertainty/confusion as to how this wasn't changing existing cases since it seems to be remapping all sorts of section names that weren't modified previously - only because they all collapse back to the same thing after the substr+find_first_not_of case below)

lib/DebugInfo/DWARF/DWARFTypeUnit.cpp
27–30 ↗(On Diff #97532)

Independent bug fixes should be separate - easier to review, make sure they're tested, etc.

As for separating DWARF64 support for str_offsets in general? Eh, if it looks nice separately & makes it clearer, OK - if it seems to work well together but still clear that it needs/has separate test coverage, that's OK too.

wolfgangp added inline comments.May 8 2017, 5:59 PM
lib/DebugInfo/DWARF/DWARFTypeUnit.cpp
27–30 ↗(On Diff #97532)

Independent bug fixes should be separate - easier to review, make sure they're tested, etc.

I split this off as https://reviews.llvm.org/D32987

  1. Added tests for invalid formatting of the string offsets section (hand-constructed DWARF).
  2. Fixed a bug found with handling of dwp files and added a hand-constructed test for it.
  3. Addressed review comments: a) rewrote validation of string offset table contribution headers b) moved mapping of debug section names to after stripping of leading _ or .
  4. Using the word 'contribution' instead of 'segment' when displaying the string offsets table.
wolfgangp marked 8 inline comments as done.May 24 2017, 12:45 PM
wolfgangp added inline comments.
lib/DebugInfo/DWARF/DWARFContext.cpp
118 ↗(On Diff #97532)

Yes, thanks for catching that.

lib/DebugInfo/DWARF/DWARFFormValue.cpp
608–610 ↗(On Diff #97532)

Looked at this for a while, but couldn't come up with a better way to write it. Do you have any suggestions?

dblaikie accepted this revision.May 29 2017, 4:11 PM

A few things to tidy up, but looks plausible.

lib/DebugInfo/DWARF/DWARFContext.cpp
134 ↗(On Diff #100144)

Move the declaration to the first use:

uint64_t StringOffset = getRelocatedValue(...
138–142 ↗(On Diff #100144)

Probably drop the {} from single line blocks - that tends to be how most of the LLVM code is written.

322–340 ↗(On Diff #100144)

Might be worthsinking this all into the dumpStringOffsetsSection function? (potentially pulling out the current function contents as dumpDwarf5StringOffsetsSection or the like?)

test/DebugInfo/dwarfdump-str-offsets-invalid.test
1–9 ↗(On Diff #100144)

Should these print error messages of some kind? I mean I realize most of dwarfdump is pretty relaxed about error handling, so perhaps it doesn't make much sense to add more just here - but worth a thought.

test/DebugInfo/dwarfdump-str-offsets.test
57–67 ↗(On Diff #100144)

Reckon it'd be worth printing the strings that these offsets refer to here? (since the strings are indirected and printed in the debug_info (which has to go through this table and then on to the strings table) it seems weird to have them missing here, maybe?

This revision is now accepted and ready to land.May 29 2017, 4:11 PM
wolfgangp updated this revision to Diff 101241.Jun 2 2017, 11:14 AM
wolfgangp marked an inline comment as done.

The patch has been accepted but in order to address the last review comments some non-trivial changes were necessary, so it probably needs another look.

  1. error messages will now be emitted when we find invalid contributions to the string offsets table. One test has been added to address the case where we encounter a degenerate .debug_str_offsets section (without a v5 unit present).
  1. Per Dave's suggestion the routines dumping the string offsets table have been reorganized a bit, so now the main dump routine for DWARFContext looks cleaner. Dumping a DWARF v5 string offsets table keys off whether we find a unit with version 5 or greater. Otherwise we dump a monolithic table. That means that the *invalid* tests had to be enhanced a bit to contain at least a rudimentary v5 compile unit.
  1. Dumping the string offsets table now also displays the strings, so the respective test has been updated.
wolfgangp marked 5 inline comments as done.Jun 2 2017, 11:16 AM
wolfgangp added inline comments.
test/DebugInfo/dwarfdump-str-offsets.test
57–67 ↗(On Diff #100144)

I agree. I wasn't sure if somebody who is exclusively looking at the string offsets table would really be interested but there is no reason not to have the strings there as well.

dblaikie accepted this revision.Jun 5 2017, 2:37 PM

Oops, forgot to say anything in my approval so it didn't send mail to the list.

Looks good - please submit. :)

This revision was automatically updated to reflect the committed changes.
wolfgangp marked an inline comment as done.