This is a huge patch, so instead of asking you guys to look at it in detail, I'll summarize the highlights (I also have 7 other smaller patches leading up to this which are not in yet. Point being that I tried as hard as I could to break it down into something smaller, but this is the best I could do). Mostly looking for high level feedback. Even this is long, but it's better than reading a 5,000 line patch :)
Problem: A separation started to occur between reading interfaces and writing interfaces. E.g. PDBFile, DbiStream, InfoStream, etc were the reading interfaces, and PDBFileBuilder, MsfBuilder, InfoStreamBuilder, etc were the writing interfaces. But due to the coupling of the reading interfaces and the underlying file buffer, the only way to write changes back out to disk was to do so through the reading interface. So we had functions like PDBFile::commit(), and DbiStream::commit(), even though these were supposed to be the reading interfaces. The way you would use the api is to create a PdbFileBuilder, fill out the fields, call PdbFileBuilder::build() on it and that would return you a PDBFile. Then you would call commit() on it.
So the problem was to get these commit() methods over to the builders, so that the reading interfaces would be truly read only.
Solution: Until this patch, there was no clear distinction between interfaces which could read and interfaces which could write. Everything could read and everything could write. This causes a problem though, because your backing data is not always writable to begin with. For example, when reading a PDB from disk and dumping it, we read it into a MemoryBuffer, which is immutable. So we would have to use hacks like returning an llvm::Error if the backing store is not writable. It's sort of analagous to if everything was a MutableArrayRef<>, but methods returned an error if the underlying memory was const. Ideally you want the types to dictate what operations are possible.
With that in mind, I split all the stream based types into readable streams, writable streams, and read/write streams. This is supported by the following fundamental types:
/// An abstract base class which defines an interface for reading data from an arbitrary backing store which may or may not be contiguous. class ReadableStream /// An abstract base class which defines an interface for writing data to an arbitrary backing store which may or may not be contiguous. class WritableStream /// A stream which is backed by a source which is both readable and writable. /// This is mostly equivalent to what was formerly known as `StreamInterface`. class ReadWriteStream : public ReadableStream, public WritableStream
These are the basis for everything. From here there are concrete implementations that are backed by contiguous in-memory byte sequences (ByteStream and MutableByteStream) which are useful when you are building your own streams, or when you have a single buffer which contains code view type data, etc.
There are concrete implementations that are backed by discontiguous block streams in an Msf file (MappedBlockStream and WritableMappedBlockStream).
There are concrete implementations that are backed by disk files (FileBufferByteStream) that are useful when you want to commit to a disk.
But it's not always useful to just have an entire stream. Sometimes you just want part of a stream. For example, maybe starting at offset 364 in a big stream is a "substream". We want some way to get a smaller view over a larger stream. For this we have ReadableStreamRef, WritableStreamRef, and ReadWriteStreamRef. StreamRefs are to Streams as ArrayRefs are to arrays. They can be trivially copied and passed by value, and provide convenience methods for dropping items, slicing, etc. But they are backed by ReadableStream, WritableStream, and ReadWriteStream, which again means they can refer to any kind of stream, whether it be a ByteStream, a MappedBlockStream, or any other kind of stream.
Finally, we have StreamReader and StreamWriter. While the underlying stream only knows how to read and write byte sequences, StreamReader and StreamWriter know how to read *types*. integers, objects, strings, etc. They wrap ReadableStreamRef and WritableStreamRef respectively, and maintain an offset into the stream where the next read or write will occur at.
these abstractions largely mirror the abstractions of before this patch, with the exception that reading and writing responsibilities are enforced by the type system.
There is one breaking change to be aware of between this patch and the previous patch. Previously, MappedBlockStream (which supported both reading and writing at the time) handled the case where you read across a block boundary by allocating from an internal pool. And if you ever requested that offset again, it would return the value allocated from the pool. Since it also supported writing, it had to handle the case where you overwrote some of the values that were previously cached from the pool. Because otherwise, if someone were to try to read that value again, they would see it was cached and return the stale value from the cache. So MappedBlockStream was smart enough to deal with this.
But now, MappedBlockStream is read-only and WritableMappedBlockStream is write-only, and just like if you make 2 unique_ptrs to the same data, bad things will happen. The only use case I had found for this advanced cache coherency was in simplicity of writing tests, so it is now up to the user to make sure he/she coordinates reading and writing appropriately and explicitly clears the read cache after any write is performed on the same backing stream.