This is an archive of the discontinued LLVM Phabricator instance.

[DebugInfo] Add error checking around data extraction in DWARFAbbreviationDeclaration::extract
ClosedPublic

Authored by bulbazord on May 30 2023, 2:40 PM.

Details

Summary

In trying to hoist errors further up this callstack, I discovered that
if the data in the debug_abbrev section is invalid entirely, the code
that parses the debug_abbrev section may do strange or unpredictable
things. The underlying issue is that DataExtractor will return a value
of 0 when it encounters an error in extracting a LEB128 value. It's thus
difficult to determine if there was an error just by looking at the
return value. This patch aims to bail at the first sight of an error in
the debug_abbrev parsing code.

Diff Detail

Event Timeline

bulbazord created this revision.May 30 2023, 2:40 PM
Herald added a project: Restricted Project. · View Herald TranscriptMay 30 2023, 2:40 PM
Herald added a subscriber: hiraditya. · View Herald Transcript
bulbazord requested review of this revision.May 30 2023, 2:40 PM
Herald added a project: Restricted Project. · View Herald TranscriptMay 30 2023, 2:40 PM

There are five new places that the function you've changed can bail out with an error, but you've only got one test. I think it makes sense to test the other new places too?

llvm/unittests/DebugInfo/DWARF/DWARFDebugAbbrevTest.cpp
47

Might be worth adding an assert that Size is a multiple of 4?

125–126

Nit: it seems unnecessary to have this over two lines?

Is it also worth checking that Offset is some specific value?

fdeazeve added inline comments.May 31 2023, 4:19 AM
llvm/unittests/DebugInfo/DWARF/DWARFDebugAbbrevTest.cpp
45

Small nit but it would be helpful for readers if we improve the naming here a bit. There is nothing inherently "Bad" or inherently "Abbreviation" in the function, this is a form of write_n algorithm.

Is it bad because we always write an odd number of elements? Or is it bad because we never have the proper zero terminators? Or because the contents of Data, when splat Size times, would fail to be decoded as an ULEB? I think the answer to this is related to @jhenderson's suggestion for different tests

In all those cases, my suggestion would be to rename this to writeValueNTimes and then on the callee side we name the data/size to BadData and BadSize and maybe add a comment on why it's bad

bulbazord updated this revision to Diff 527239.May 31 2023, 4:55 PM
  • Add more rigorous testing. Specifically, I'm testing every possible place where we we can bail early from DWARFAbbreviationDeclaration::extract.
  • Replace the writeBadAbbreviationDeclarationData with more specific functions to handle both malformed input and input that is too large (both kinds of LEB128 extraction errors).
fdeazeve accepted this revision.Jun 1 2023, 3:32 AM

The test looks great!

This revision is now accepted and ready to land.Jun 1 2023, 3:32 AM

Looks like some lld-macho tests are failing... I'm looking at them and the debug_abbrev data in them looks malformed. I'll update those tests here as well.

bulbazord updated this revision to Diff 527627.Jun 1 2023, 2:40 PM

Update lld-macho tests

Herald added a project: Restricted Project. · View Herald TranscriptJun 1 2023, 2:40 PM
Herald added a reviewer: Restricted Project. · View Herald Transcript
bulbazord added a subscriber: int3.Jun 1 2023, 2:43 PM

@int3 Hey can you take a look at the lld tests to make sure I didn't make a mistake? The debug_abbrev sections were malformed, I've manually added the necessary terminating 0s into the tests. They do pass on my machine after the change.

bulbazord updated this revision to Diff 528569.Jun 5 2023, 1:54 PM

Updating for pre-submission checks

int3 added a comment.Jun 5 2023, 4:52 PM

Seems reasonable, I probably deleted those bytes when trying to make a 'minimal' test case

int3 accepted this revision.Jun 5 2023, 4:52 PM

Re-applied as 94935c0d9ac89263dd0d7ce6bfb6f700c0f10fa2