This is an archive of the discontinued LLVM Phabricator instance.

[llvm-dwarfdump] - Fix array out of bounds access crash.
ClosedPublic

Authored by grimar on Oct 23 2017, 9:04 AM.

Details

Summary

I faced random crash in llvm-dwarfdump, which was randomly reproducable.
It happens because llvm-dwarfdump can access array out of bounds when DWARF
parsers tries to get children DIEs which are absent because of corrupted .debug_data.

Problem is in a following method:

DWARFDie getFirstChild() const {
  if (isValid() && Die->hasChildren())
    return DWARFDie(U, Die + 1);
  return DWARFDie();
}

Here new DWARFDie is created, but there is no checks that Die + 1
is a valid memory, because Die is a simple pointer. Though
it is possible for Die + 1 to point on garbage data and testcase provided
shows that.

I suggest to wrap Die into ArrayRef, what allows to do all the necessary safety checks.

Diff Detail

Repository
rL LLVM

Event Timeline

grimar created this revision.Oct 23 2017, 9:04 AM
aprantl added inline comments.Oct 23 2017, 9:24 AM
include/llvm/DebugInfo/DWARF/DWARFDie.h
45 ↗(On Diff #119859)

Can you add a comment explaining why the Die member is an array and what the bounds of the array are representing? I assume that the last member of the array is the last children of the DIE?

tools/llvm-dwarfdump/llvm-dwarfdump.cpp
284 ↗(On Diff #119859)

why does this break the iterator?

dblaikie added inline comments.Oct 23 2017, 10:17 AM
include/llvm/DebugInfo/DWARF/DWARFDie.h
55 ↗(On Diff #119859)

This is probably better written as:

return &Die[0]

Though I guess this returns a pointer because it can be/is called on invalid DWARFDies? (in which case the above code would be invalid - and maybe we should assert that this isn't called on invalid DWARFDies? & maybe have it return a reference instead of a pointer if that's possible)

115 ↗(On Diff #119859)

This is probably incorrect - in the sense that it'll produce a DWARFDie that has a longer array that contains more than its own children.

Imagine this tree of DIEs:

A
  B
  C

If you ask A for its first child, you'd get a DWARFDie containing {B, C} and then if you asked B for its first child, you could end up with C - even though C is a sibling of B, not a child.

I guess that bug already existed, but still...

lib/DebugInfo/DWARF/DWARFDie.cpp
504 ↗(On Diff #119859)

Probably worth being consistent about using either Die.front() or Die[0] across this change. I'm not too fussed about which, really. I guess I lean slightly towards front(), though accept it's more verbose..

lib/DebugInfo/DWARF/DWARFUnit.cpp
428 ↗(On Diff #119859)

I'd probably write DieArray.data() + I as &DieArray[I] - assuming I is in bounds?

Thanks everyone for comments ! I plan to address them tomorrow, few quick answers are below.

include/llvm/DebugInfo/DWARF/DWARFDie.h
45 ↗(On Diff #119859)

No, last member is the las DIE of unit, just like in original code.
I'll revisit this place. May be we can just have array of children here.

115 ↗(On Diff #119859)

you'd get a DWARFDie containing {B, C} and then if you asked B for its first child, you could end up with C

It should not happen. Because Die[0].hasChildren() takes information about children from abbreviation table,
so for that case it will not find DW_CHILDREN_yes there and will return empty DWARFDie as expected.

I'll try to improve this place though.

tools/llvm-dwarfdump/llvm-dwarfdump.cpp
284 ↗(On Diff #119859)

It does not break it.
Problem is that CU->dies() iterates over array of DWARFDebugInfoEntry.
Previous code just used pointer as argument:

DWARFDie Die = {CU.get(), &Entry};

But now I need to pass ArrayRef of them to DWARFDie constructor, I need to find array size
or take DWARFDie that is already exist.

grimar updated this revision to Diff 120020.Oct 24 2017, 2:29 AM
  • Reimplemented in a much simpler and less error-prone way.
aprantl accepted this revision.Oct 24 2017, 8:47 AM

Out of curiosity: does llvm-dwarfdump --verify fail on your testcase?

This revision is now accepted and ready to land.Oct 24 2017, 8:47 AM

Out of curiosity: does llvm-dwarfdump --verify fail on your testcase?

Yes, it complains:

Verifying .debug_abbrev...
Verifying .debug_info Unit Header Chain...
error: Units[1] - start offset: 0x00000020
note: The length for this unit is too large for the .debug_info provided.
note: The 16 bit unit header version is not valid.
note: The address size is unsupported.
Verifying .debug_info references...
Errors detected.
warning: DWARF compile unit extends beyond its bounds cu 0x00000000 at 0x0000002
1'
warning: DWARF compile unit extends beyond its bounds cu 0x00000000 at 0x0000002
1'

Out of curiosity: does llvm-dwarfdump --verify fail on your testcase?

Yes, it complains:

Awesome!

Verifying .debug_abbrev...
Verifying .debug_info Unit Header Chain...
error: Units[1] - start offset: 0x00000020
note: The length for this unit is too large for the .debug_info provided.
note: The 16 bit unit header version is not valid.
note: The address size is unsupported.
Verifying .debug_info references...
Errors detected.
warning: DWARF compile unit extends beyond its bounds cu 0x00000000 at 0x0000002
1'
warning: DWARF compile unit extends beyond its bounds cu 0x00000000 at 0x0000002
1'

Uh.. that means we either have a bug in our DWARF backend or in the Verifier (probably the Verifier). That also needs to be fixed at some point :-)

This revision was automatically updated to reflect the committed changes.