This is an archive of the discontinued LLVM Phabricator instance.

Move Stream code from MSF -> Support
ClosedPublic

Authored by zturner on Feb 22 2017, 12:01 PM.

Diff Detail

Repository
rL LLVM

Event Timeline

zturner created this revision.Feb 22 2017, 12:01 PM
zturner updated this revision to Diff 89424.Feb 22 2017, 2:35 PM

Moved specification of the endianness from the individual call to read to the constructor of the Reader / Writer. This mimics what DataExtractor does, makes more sense from a conceptual standpoint since you will rarely want to read mixed endian values from a single stream, and also solves one of the issues preventing the Stream Mapper class from being generalized.

Note that it's still possible to read/write mixed endian values, you just have to do it via readObject by passing a packed_endian_specific_integer etc, instead of by using readInteger.

zturner updated this revision to Diff 89444.Feb 22 2017, 4:57 PM

Fixed a bunch of bugs that surfaced when I tried to add tests for BinaryItemStream. Yay tests!

zturner updated this revision to Diff 89450.Feb 22 2017, 5:23 PM

Some changes to ArrayRef.h had snuck in that were both incorrect and unnecessary. Removed them.

zturner updated this revision to Diff 89539.Feb 23 2017, 11:34 AM

Added some runtime alignment checks in all of the functions that reinterpret_cast bytes to object pointers. This caught 2 instances of UB (yay!), so I've fixed those as well.

Ping, anyone have any thoughts? So far we have identified at a minimum LLDB, LLVM PDB library, and LLVM libobject that can make use of this, with potential implications all over the LLVM DWARF reading and writing code where it uses DataExtractor. IMO it also looks useful any time we're reading or writing any kind of binary format as well so that people don't have to keep rolling their own code (so potentially everywhere).

beanz edited edge metadata.Feb 24 2017, 10:07 AM

This all looks pretty great to me.

echristo accepted this revision.Feb 24 2017, 11:11 AM

Couple of inline comments while I was reading through. Hope we can use this to clean up some of the reading/writing interfaces in future patches :)

-eric

llvm/include/llvm/Support/BinaryStreamReader.h
30–31 ↗(On Diff #89539)

Grammar.

llvm/lib/Support/BinaryStreamWriter.cpp
28 ↗(On Diff #89539)

This seems like you want to say "writeTerminatedString"?

I'd probably go with "writeCString" or "writeString" and use "writeUnterminatedString"? Or something?

42 ↗(On Diff #89539)

Don't we already update the offset in writeBytes?

This revision is now accepted and ready to land.Feb 24 2017, 11:11 AM
zturner added inline comments.Feb 24 2017, 11:18 AM
llvm/lib/Support/BinaryStreamWriter.cpp
28 ↗(On Diff #89539)

Yea, I could call it something else. But this function is a little weird in that the input itself need not be null terminated. So even if Str is not null terminated, it will write the null terminator anyway. I'm not sure if changing the name to writeCString would make it confusing. And writeNullTerminatedString would be too long. So writeZeroString or writeTerminatedString seems the best, but I'm not tied to either one (Zero just seems the least verbose and also specifies what the terminator is, rather than it being some arbitrary character)

42 ↗(On Diff #89539)

If you were to call this->writeBytes then yes it would update the offset. This writes them directly to the underlying stream (which doesn't maintain an offset, only the writer maintains the offset).

But this raises the question of why don't I just call this->writeBytes directly. Good question! I'll fix that.

echristo added inline comments.Feb 24 2017, 11:25 AM
llvm/lib/Support/BinaryStreamWriter.cpp
28 ↗(On Diff #89539)

*shrug* If it becomes a problem we can also just rename it :)

Just "writeZeroString" seems to say "000000000" to me...

42 ↗(On Diff #89539)

... Oh, ok.

Yeah, that's confusing :)

zturner updated this revision to Diff 89703.Feb 24 2017, 11:45 AM

Fixed suggestions from echristo and other minor improvements (for example, I added a function to read/write multiple integers at once which is more efficient than doing them one at a time).

This revision was automatically updated to reflect the committed changes.
llvm/trunk/tools/llvm-pdbdump/llvm-pdbdump.cpp