This is an archive of the discontinued LLVM Phabricator instance.

[pdb] Try super hard to conserve memory in llvm-pdbdump
ClosedPublic

Authored by zturner on May 25 2016, 5:07 PM.

Details

Summary

We map the entire PDB into the process, and then when reading various streams, in order to guarantee that we have certain structures in a contiguous format, we read them out of the stream and order them appropriately, essentially copying the bytes.

This is incredibly memory inefficient, as it essentially means that loading a PDB into memory will use roughly double the file size since almost all important data structures are copied.

To address this, I've introduced a number of steps:

  1. Introduce a class called StreamView. This is analagous to an ArrayRef. It provides a limited window on top of a larger stream (which can be any type of stream, including another StreamView. This is useful for constraining stream operations to specific substreams or fields, for example when one large stream is broken into multiple logical sections.
  1. Introduce a set of 3 "stream data structures". These are currently StreamString, FixedStreamArray<T>, and VarStreamArray. These classes all share the same underlying purpose: Try to return references to values in the source byte stream if it can be done contiguously, but if not, copy them into temporary storage. The first one wraps a String, the second one wraps an array of fixed size records, and the third one wraps an array of variable length records. VarStreamArray will prove particularly useful, because in order to return a reference to the data in the source byte stream, the entire array need not be contiguous, only the single record being requested.
  1. Update the StreamReader class to be able to read values of type StreamString, FixedStreamArray<T>, and VarStreamArray.

I updated DBIStream to use these new classes in a number of places, but currently there is not much memory savings because it's not yet being used on the type and symbol records stream, which comprise 95% of the file size. I plan to do that in a subsequent patch but I just wanted to get the infrastructure in place first.

Diff Detail

Event Timeline

zturner updated this revision to Diff 58540.May 25 2016, 5:07 PM
zturner retitled this revision from to [pdb] Try super hard to conserve memory in llvm-pdbdump.
zturner updated this object.
zturner added reviewers: rnk, ruiu.
zturner added a subscriber: llvm-commits.
zturner updated this object.May 25 2016, 5:09 PM
ruiu edited edge metadata.May 25 2016, 5:39 PM

A more radical idea would be to call mmap multiple times with explicit "map-to" addresses to map parts of a PDB file to stitch the blocks that consist a stream together in the virtual memory space. Then you could use regular StringRef and ArrayRef, and it would probably be very fast. But this wouldn't work if your machine's page size is larger than the file's page size. So I don't know if it's feasible.

include/llvm/DebugInfo/CodeView/StreamArray.h
24

Is "discontiguous blocks in a file" better?

97

It will construct an object of type T. If T's constructor needs a parameter, it wouldn't compile. You want to use

mutable uint8_t TempItem[sizeof(T)];

instead for a buffer.

lib/DebugInfo/CodeView/StreamArray.cpp
20 ↗(On Diff #58540)

So RecordOffsets maps each byte position to each byte position, correct? I wonder if it's going to be too much. If you access sequentially from 0 to 100MB using operator[], is it going to insert 100 million items to the hash?

51 ↗(On Diff #58540)

Please add newline at end of file.

lib/DebugInfo/CodeView/StreamString.cpp
1 ↗(On Diff #58540)

I wouldn't check this file in until I need to. ".h" only should be fine.

zturner updated this revision to Diff 58573.May 25 2016, 11:57 PM
zturner edited edge metadata.

Updated based on suggestions

In addition to the above comments, I went ahead and fixed VarStreamArray to work, and I am actually using it to iterate over the ModInfos that we were previously using a custom iterator for. In order to make this work, I had to introduce a new StreamObject class. In the future we may even be able to simplify the code of FixedStreamArray<T> by making it internally use StreamObject<T> to handle some of the logic.

Since this class now works correctly for the ModuleInfo array, I expect it will be simple to extend it to the symbol and type streams, but I want to do that in a followup patch.

include/llvm/DebugInfo/CodeView/StreamArray.h
98

If T has a non-trivial constructor, this class won't work anyway because we are reinterpret-casting a raw byte stream to the object type. So, in fact, this class *should* fail to compile for classes with non-trivial constructors.

lib/DebugInfo/CodeView/StreamArray.cpp
20 ↗(On Diff #58540)

Yea, maybe so. I think at some point we may need something like this, but on the other hand maybe we can use the integrated hash table for those. For now I've removed this and we support forward only iteration.

ruiu added inline comments.May 26 2016, 10:51 AM
include/llvm/DebugInfo/CodeView/StreamObject.h
26 ↗(On Diff #58573)

sizeof(T) is a compile-time constant. Do you need to allocate it dynamically?

lib/DebugInfo/CodeView/StreamReader.cpp
76–79

This seems to be copying the string contents to the local std::string object whether the source string is contiguous in memory or not. Am I missing something?

zturner added inline comments.May 26 2016, 11:02 AM
include/llvm/DebugInfo/CodeView/StreamObject.h
26 ↗(On Diff #58573)

I did this because I didn't want sizeof(StreamObject) to be >= sizeof(T). It's probably minor, but in any case I should be able to fix this after switching to the pooled allocator.

lib/DebugInfo/CodeView/StreamReader.cpp
76–79

This was kind of a weird function to write. Yes, it copies the source string to a temporary std::string object. It does this in order to figure out how long the string is. The logic of scanning one by one for a \0 is already handled in the ovther overload of this function and I didn't want to reproduce it. So after it returns the string in a temporary, it can calculate the length N. Then, it tries to get N bytes contiguously (see the call to getArrayRef below. If it works, yay, make a StreamString with a reference to those N bytes. If it didn't work, make a StreamString using the temporary std::string (which will make a copy of the temporary variable into its own internal std::string, and then return a reference to that).

In any case, I'm re-working all this stuff to use the pooled allocator, because since using the temporary storage is so rare, it doesn't make sense to have the overhead of storing a whole std::string every time.

rnk edited edge metadata.May 26 2016, 11:10 AM

We discussed in person structuring this so we don't need special stream classes to refer to data from the file or from a memory pool on the side.

include/llvm/DebugInfo/PDB/Raw/ModInfo.h
48–49

Let's try to arrange things so these are just plain StringRefs pointing into the file or optionally a pool on the side.

zturner updated this revision to Diff 58683.May 26 2016, 1:42 PM
zturner edited edge metadata.

Updated based on suggestions. With this patch I've deleted StreamObject and StreamString, and fallen back to raw pointers and StringRefs. MappedBlockStream uses a BumpPtrAllocator to allocate space for contiguous memory only in situations where the read would otherwise have crossed a block boundary.

We also removed many uses of ByteStream (a source of memory copying), but a few still remain, in particular with the symbol and type streams since they are more difficult.

I did however get rid of the ByteStream used for the module file info array in DBIStream, and am using VarStreamArray instead, which takes advantage of these zero-copy improvements. This is to show that it works in practice before we move on to applying it to the rest of the streams.

zturner updated this revision to Diff 58692.May 26 2016, 1:54 PM

Cache addresses which were previously allocated from the pool so that we don't keep allocating from the pool multiple times for the same stream offset.

ruiu added a comment.May 26 2016, 2:43 PM

Generally looking good. I think I now fully understand how it works.

include/llvm/DebugInfo/CodeView/StreamArray.h
127–129

If this should never happen, you want to make it an assertion instead of swallowing errors.

lib/DebugInfo/CodeView/StreamReader.cpp
76–79

This looks much better now.

lib/DebugInfo/PDB/Raw/MappedBlockStream.cpp
39–40

How is it likely that a object we want to fetch spans two blocks but the two blocks are contiguous in the file? If it is likely, you might want to optimize for that case. (In that case you don't need to copy contents.)

zturner added inline comments.May 26 2016, 4:30 PM
lib/DebugInfo/PDB/Raw/MappedBlockStream.cpp
39–40

A quick use of llvm-pdbdump -raw-stream-blocks on a large PDB shows this is actually surprisingly common. In fact, streams blocks are almost always contiguous. So this will be a really good optimization. I'll make that change and upload the patch.

zturner updated this revision to Diff 58724.May 26 2016, 5:30 PM

Updated based on suggestions. Now can return cross-block references as long as the referenced blocks are contiguous. Also got a big memory savings by not reading the entire names buffer in the NameHashTable class as a giant StringRef. Instead I now just treat it as a StreamRef and read individual streams from it.

With this patch across all streams of a 9MB PDB, the total amount of memory copied is less than 1,000 bytes.

ruiu accepted this revision.May 26 2016, 5:44 PM
ruiu edited edge metadata.

LGTM

lib/DebugInfo/PDB/Raw/MappedBlockStream.cpp
63

Please write a comment for this function that the function is to try to read contents if they are contiguous in file.

This revision is now accepted and ready to land.May 26 2016, 5:44 PM
zturner closed this revision.May 26 2016, 8:13 PM