This is an archive of the discontinued LLVM Phabricator instance.

[DWARFDebugLine] Use truncating data extractors for prologue parsing
ClosedPublic

Authored by labath on Apr 6 2020, 8:02 AM.

Details

Summary

This makes the code easier to reason about, as it will behave the same
way regardless of whether there is any more data coming after the
presumed end of the prologue.

Diff Detail

Event Timeline

labath created this revision.Apr 6 2020, 8:02 AM
Herald added a project: Restricted Project. · View Herald TranscriptApr 6 2020, 8:03 AM
probinson added inline comments.Apr 6 2020, 8:26 AM
llvm/include/llvm/DebugInfo/DWARF/DWARFDebugLine.h
132

The truncating constructor takes a const DWARFDataExtractor& so I don't understand why this parameter had to change.

Some nice cleanup indeed - thanks! Please wait for @jhenderson to also approve, since he's been doing more work in this area. (never sure whether to approve or not in this case - not sure if the official "approval" might then remove it from other people's review queue, etc)

MaskRay added inline comments.Apr 6 2020, 12:10 PM
llvm/unittests/DebugInfo/DWARF/DWARFDebugLineTest.cpp
457

At a first glance, 2 : 8 looks a bit magical. This may deserve a comment.

probinson added inline comments.Apr 6 2020, 1:33 PM
llvm/unittests/DebugInfo/DWARF/DWARFDebugLineTest.cpp
458

The version check should be < rather than != here.

465

And here, Version < 5 please.

Some nice cleanup indeed - thanks! Please wait for @jhenderson to also approve, since he's been doing more work in this area. (never sure whether to approve or not in this case - not sure if the official "approval" might then remove it from other people's review queue, etc)

FWIW, in my case, I just use my emails as my review queue, so I don't mind others approving too.

llvm/test/tools/llvm-dwarfdump/X86/Inputs/debug_line_malformed.s
195

I wonder if it's worth a comment at the start here saying something to explain why the formatting of the following bytes is weird. Perhaps something like "The following bytes would be read as part of the prologue header, were data extraction not stopped once the data end is reached, hence they are formatted like a header".

Something similar applies for the other changes in this file.

324

I think this comment is missing a word somewhere. Probably "truncated"? Also missing a trailing full stop. Also "md5" -> "MD5".

llvm/test/tools/llvm-dwarfdump/X86/debug_line_invalid.test
115

This should probably show that the last opcode within range is read successfully in addition to showing the past-the-end one is recorded as zero.

143

Missing trailing full stop.

This may not have been clear from the test comments before, but this and the previous case were supposed to work in tandem, because the invalid MD5 form message was reported in both cases, but not the "not at end" message IIRC.

I feel like the input for this case probably should be updated to not have a malformed MD5 form. Is there any interaction any more to do with continuing to parse that these 2 cases now don't need to cover?

206

How straightforward would it be to provide the offset as extra context here, like we do with "unexpected end of data"?

llvm/unittests/DebugInfo/DWARF/DWARFDebugLineTest.cpp
470

I'm not sure where this error is coming from (I don't recall seeing it before), but it's got a spurious trailing full stop, which is inconsistent with the other error messages. It might want fixing separately I guess.

478–482

What happened to this error message?

labath updated this revision to Diff 267607.Jun 1 2020, 6:54 AM
labath marked 5 inline comments as done.
  • rebase on latest master
  • this does not attempt to address any comments yet. I'll do that separately.
labath updated this revision to Diff 267618.Jun 1 2020, 7:50 AM
labath marked 6 inline comments as done.
  • address reviewer comments (I am very sorry about the long delay)
llvm/include/llvm/DebugInfo/DWARF/DWARFDebugLine.h
132

It wasn't strictly necessary, but with at const & argument, I'd have to create new object to hold the truncated extractor -- this way I can reassign it. It seemed like a good idea to avoid having two data extractors floating around...

llvm/test/tools/llvm-dwarfdump/X86/Inputs/debug_line_malformed.s
195

Actually, I was thinking of just deleting this stuff as it is no longer interesting to test. I wanted to do it as a separate commit, as that will change the offsets of everything, making it hard to see what changed. WDYT?

llvm/test/tools/llvm-dwarfdump/X86/debug_line_invalid.test
143

I've changed this test to use a valid form. I don't think more test are needed -- this is pretty generic code, and form value extraction is tested in DWARFFormValueTest.cpp.

206

That was done with D80799.

llvm/unittests/DebugInfo/DWARF/DWARFDebugLineTest.cpp
470

there are a couple of messages like this in parseV5DirFileTables so I'll fix those separately. The reason you haven't seen it before was that before D77308 DWARFFormValue::extractValue was always "successful".

478–482

It's just gone because the error has changed. Now it is not possible for parsing to run off the end of the prologue, so the failure mode is different.

jhenderson added inline comments.Jun 2 2020, 1:19 AM
llvm/lib/DebugInfo/DWARF/DWARFDebugLine.cpp
383

We probably also want to truncate it based on the unit length first, before reading the version etc. I've actually got plans for a local change related to that, but as you're working in this area, I'm happy for you to do it as part of this or a subsequent change, if you want, rather than me doing it.

440–442

A number of our other error messages explicitly use "at offset 0x..." for the offset part of the message. Perhaps we should update this message whilst you're here?

Alternatively, perhaps we should remove "offset" in all other locations where it is already present (as a separate change). I think either can be argued for - the former for complete clarity, the latter for conciseness.

447

We probably still want this line - if you don't update it, and the prologue finishes early, you'll get the recoverable error, but then parsing will continue from the point it got to rather than the start of the table body, as claimed by the length. In general, I think we should prefer following the line table's stated lengths where possible.

llvm/test/tools/llvm-dwarfdump/X86/Inputs/debug_line_malformed.s
195

I'm happy to remove any unnecessary additional test cases, where they provide no new value, but we need to be slightly careful we don't lose coverage of parsing the file/directory name tables. If a test case doesn't cover a unique code path, then I'm good for it to be deleted, and as a separate patch is fine.

labath updated this revision to Diff 268145.Jun 3 2020, 4:50 AM
labath marked 11 inline comments as done.
  • truncate the extractor based on the unit length first (and fix one test which had a bogus unit length)
  • add "offset"s
llvm/lib/DebugInfo/DWARF/DWARFDebugLine.cpp
383

I can do that now.

440–442

I've added the offsets. I don't have any opinions on whether should include the "offset" part or not.

447

The reason it's not needed is because there's a *OffsetPtr = DebugLineOffset + Prologue.getLength(); down on line 738.

llvm/test/tools/llvm-dwarfdump/X86/Inputs/debug_line_malformed.s
195

I'm not talking about deleting entire test cases -- just the bits of line tables that are prologue extensions. They were needed/interesting to test when the prologue parsing would not respect prologue boundaries, but now I don't think they serve any purpose and I think we can just put a simple DW_LNE_end_sequence as the line table contents.

Looks good to me, but I'll hold off approval so you can touch up the tests.

llvm/test/tools/llvm-dwarfdump/X86/Inputs/debug_line_malformed.s
195

The "standard" contents are a set address and end sequence for reasons I forget these days, so we should probably keep just that much to match the other cases. Beyond that, I'm happy for things to be simplified where they don't add value.

labath updated this revision to Diff 268174.Jun 3 2020, 6:40 AM
  • remove the extraneous opcodes and adjust offsets
  • adjust tests for the "offset" addition in the previous revision
jhenderson accepted this revision.Jun 3 2020, 7:10 AM

LGTM, with one fix to the error message.

llvm/unittests/DebugInfo/DWARF/DWARFDebugLineTest.cpp
434

prologue -> prologue end

This revision is now accepted and ready to land.Jun 3 2020, 7:10 AM
This revision was automatically updated to reflect the committed changes.
jhenderson added inline comments.Jun 11 2020, 1:59 AM
llvm/lib/DebugInfo/DWARF/DWARFDebugLine.cpp
440

Oops, just spotted a mistake here: "reaching the prologue at" -> "reaching the prologue end at"

Are you okay to fix?

labath marked 2 inline comments as done.Jun 11 2020, 4:06 AM
labath added inline comments.
llvm/lib/DebugInfo/DWARF/DWARFDebugLine.cpp
440

Done in 0b781db9087. Thanks for catching that.

llvm/test/tools/llvm-dwarfdump/X86/debug_line_invalid.test