This is an archive of the discontinued LLVM Phabricator instance.

Minidump: Add support for reading/writing strings
ClosedPublic

Authored by labath on Mar 25 2019, 9:10 AM.

Details

Summary

Strings in minidump files are stored as a 32-bit length field, giving
the length of the string in *bytes*, which is followed by the
appropriate number of UTF16 code units. The string is also supposed to
be null-terminated, and the null-terminator is not a part of the length
field. This patch:

  • adds support for reading these strings out of the minidump file (this implementation does not depend on proper null-termination)
  • adds support for writing them to a minidump file
  • using the previous two pieces implements proper (de)serialization of the CSDVersion field of the SystemInfo stream. Previously, this was only read/written as hex, and no attempt was made to access the referenced string -- now this string is read and written correctly.

The changes are tested via yaml2obj|obj2yaml round-trip as well as a
unit test which checks the corner cases of the string deserialization
logic.

Diff Detail

Event Timeline

labath created this revision.Mar 25 2019, 9:10 AM
Herald added a project: Restricted Project. · View Herald TranscriptMar 25 2019, 9:10 AM

Looks fine to me, but probably need a LLVM specific person for the final ok

Looks fine to me, but probably need a LLVM specific person for the final ok

Maybe @zturner or @amccarth could be the LLVM person ? :)

jhenderson added inline comments.Apr 4 2019, 6:25 AM
include/llvm/ObjectYAML/MinidumpYAML.h
70

Don't know about lifetime here, but could this be a StringRef?

unittests/Object/MinidumpTest.cpp
270

What about a case where the size field cannot fit in the remaining data?

labath marked 3 inline comments as done.Apr 4 2019, 6:44 AM
labath added inline comments.
include/llvm/ObjectYAML/MinidumpYAML.h
70

A StringRef would work when constructing this object from YAML (as there the yaml file can provide backing storage). However, when constructing this from a Minidump binary, that would not work, as the data is stored there in the utf16 form (so not the kind of "string" we usually work with).

It would probably be possible to pull of some kind of a trick like yaml::BinaryRef does, where this would be a special object, which could be backed either by a real (utf8) string, or by the minidump utf16 thingy. However, I don't think that would be worth the trouble, as these strings are fairly small. (Also, in order to validate the utf16 data (so I don't end up creating an invalid object and crash during serialization), I'd have to essentially decode the data into a std::string anyway and then throw the result away.)

unittests/Object/MinidumpTest.cpp
270

Ah yes, good catch. I'll add that.

labath updated this revision to Diff 193698.Apr 4 2019, 6:51 AM
labath marked an inline comment as done.

add the size-does-not-fit test case

This revision is now accepted and ready to land.Apr 4 2019, 7:48 AM
This revision was automatically updated to reflect the committed changes.