This is an archive of the discontinued LLVM Phabricator instance.

[pdb/lld] Write a valid FPM
ClosedPublic

Authored by zturner on Aug 2 2017, 11:45 AM.

Details

Summary

Prior to this patch, we did not emit a free page map to the PDB. Instead, we just wrote 0 bytes for the free page map. Since an a free page map represents a minimum of BlockSize * 8 blocks, this had the effect of saying that every PDB is a minimum of 4096*8*4096 = 134,217,728 bytes. This had undesirable side effects when re-linking our PDB with MSVC, which would output a PDB of slightly above the aforementioned size.

The fix for this is to simply write the actual bits of the FPM to appropriate block, but it is not quite that simple. Since one block of FPM data can only describe the state of 4096*8 = 32,768 blocks, supporting PDBs larger than this requires extra magic. To handle this, blocks of the form {1,2} + k * BlockSize are reserved and should not be allocated to any other type of PDB data. The FPM itself can then be viewed as a stream over the blocks {1 + 4096*0, 1 + 4096*1, 1 + 4096*2, ...} or {2 + 4096*0, 2 + 4096*1, 2 + 4096*2, ...} depending on the value of SuperBlock->FreeBlockMapBlock.

So we need some extra complexity that when allocating blocks from a stream, we check if the MSF file requires growing to vend the requested blocks. If it does, then we look to see if the growth would allocated blocks with the aforementioned indices, and if so we mark them as used in the FPM so that they will not be vended by the block allocator.

But that's not all! Another source of trickiness is in the fact that not all FPM blocks are even used. For example, if a PDB file has 5000 blocks, and SuperBlock->FreeBlockMapBlock == 1 , then blocks 1, 2, 4097, and 4098 are FPM blocks but only block 1 is allocated. And from block 1, only 625 bytes are needed. But blocks 2, 4097, and 4098 still need to be initialized to 0xFF, as do the extraneous bytes of block 1. Getting this to work invisibly while allowing the user of the API to see only those bytes of the FPM that describe actual blocks required some finesse.

Diff Detail

Repository
rL LLVM

Event Timeline

zturner created this revision.Aug 2 2017, 11:45 AM
rnk added inline comments.Aug 2 2017, 1:07 PM
llvm/include/llvm/DebugInfo/MSF/MSFCommon.h
115 ↗(On Diff #109390)

Can you refactor this alignTo(X, Y) / Y pattern into a helper like divideRoundUp or divideCeil? The usual way to do that is (X + Y - 1) / Y. alignTo does stuff we don't need.

llvm/lib/DebugInfo/PDB/Native/PDBFileBuilder.cpp
158 ↗(On Diff #109390)

Why not make this part of MSFBuilder? This doesn't use any PDBFileBuilder state, just an MSFLayout.

167–169 ↗(On Diff #109390)

Why do you need to recreate the stream and writer?

zturner added inline comments.Aug 2 2017, 1:17 PM
llvm/include/llvm/DebugInfo/MSF/MSFCommon.h
115 ↗(On Diff #109390)

Yea I can add that.

llvm/lib/DebugInfo/PDB/Native/PDBFileBuilder.cpp
158 ↗(On Diff #109390)

MSFBuilder doesn't actually write any bytes, it just returns some bookeeping information (specifically, an MSFLayout structure) that describe the the MSF. It might be worth doing some refactoring and changing MSFBuilder::build() to MSFBuilder::commit() and write the super block, stream directory, and other MSF specific stuff, but I'd rather do that separately. PDBFileBuilder::commit() already writes all of that other MSF specific stuff anyway, so we might as well be consistent.

167–169 ↗(On Diff #109390)

The stream doesn't have methods to update its length, so we need a new stream, and the writer doesn't have a method to set the stream to a new stream.

zturner added inline comments.Aug 2 2017, 1:19 PM
llvm/lib/DebugInfo/PDB/Native/PDBFileBuilder.cpp
167–169 ↗(On Diff #109390)

BTW, the cost of "making a new stream" is not high. Nothing is really happening other than allocating a stream object and setting some of its fields. So we're wasting one (small) heap allocated object, but there's no copying or anything going on.

rnk added inline comments.Aug 2 2017, 1:24 PM
llvm/lib/DebugInfo/PDB/Native/PDBFileBuilder.cpp
167–169 ↗(On Diff #109390)

Are you sure? createFpmStream above seems to initialize the stream to 0xFF, so it's probably like a 4K memset at least.

zturner added inline comments.Aug 2 2017, 1:26 PM
llvm/lib/DebugInfo/PDB/Native/PDBFileBuilder.cpp
167–169 ↗(On Diff #109390)

Ahh, you are right. I was confusing this code with code somewhere else. This line shouldn't be here. I used to have the initialization to 0xFF code here, so there was code to read the long version and then the short version. Then I sunk all that code into the createFpmStream method and didn't clean up properly here. Nice catch.

zturner updated this revision to Diff 109440.Aug 2 2017, 3:24 PM

Updated from suggestions.

rnk accepted this revision.Aug 2 2017, 3:29 PM

lgtm

This revision is now accepted and ready to land.Aug 2 2017, 3:29 PM
This revision was automatically updated to reflect the committed changes.