This is an archive of the discontinued LLVM Phabricator instance.

pdbdump: Dump Free Page Map contents.
ClosedPublic

Authored by ruiu on Jul 29 2016, 1:13 PM.

Details

Diff Detail

Repository
rL LLVM

Event Timeline

ruiu updated this revision to Diff 66166.Jul 29 2016, 1:13 PM
ruiu retitled this revision from to pdbdump: Dump Free Page Map contents..
ruiu updated this object.
ruiu added a reviewer: zturner.
ruiu added a subscriber: llvm-commits.
majnemer added inline comments.
lib/DebugInfo/PDB/Raw/PDBFile.cpp
49–52 ↗(On Diff #66166)

Would SparseBitVector be a more appropriate API to use?

53–56 ↗(On Diff #66166)

This will only read a single block right?
If the block size was 4 KB, would that mean that the FPM can only account for 128 MB?
This seems small.

Microsoft's code suggests that the FPM could span two blocks:
https://github.com/Microsoft/microsoft-pdb/blob/master/PDB/msf/msf.cpp#L222

zturner added inline comments.Jul 29 2016, 1:26 PM
lib/DebugInfo/PDB/Raw/PDBFile.cpp
52 ↗(On Diff #66166)

How about adding another field to the MsfLayout class called FreePageMap? You will need to set the value in MsfBuilder::build() before returning, and you will also need to set it in PDBFile::parseFileHeaders().

This way you wouldn't need to return an Expected<std::vector<uint32_t>> but rather a const BitVector&, and the reading / writing code would share the same common structures.

53–56 ↗(On Diff #66166)

It doesn't span 2 pages. But it's *either* page 1 or page 2. Which one it is depends on the second field in the super block. But whichever one it is, it's only that 1 page.

majnemer added inline comments.Jul 29 2016, 1:31 PM
lib/DebugInfo/PDB/Raw/PDBFile.cpp
53–56 ↗(On Diff #66166)

Hmm, but there is also:

In the interest of the on-disk size of the file, the FPM has been split across the file at regular intervals, with new intervals introduced as needed.

https://github.com/Microsoft/microsoft-pdb/blob/master/PDB/msf/msf.cpp#L49

zturner added inline comments.Jul 29 2016, 1:32 PM
lib/DebugInfo/PDB/Raw/PDBFile.cpp
53–56 ↗(On Diff #66166)

Yea, but read the very next line:

According to hdr.pnFpm, the first or the second free page map is valid.

What I'm not sure about is this line:

In the interest of the on-disk size of the file, the FPM has been split across the file at regular intervals, with new intervals introduced as needed
ruiu added inline comments.Jul 29 2016, 1:33 PM
lib/DebugInfo/PDB/Raw/PDBFile.cpp
53–56 ↗(On Diff #66166)

At least, the test PDB file I was using contains a seemingly valid FPM in FPM2, and FPM1 contents doesn't seem to make sense.

zturner added inline comments.Jul 29 2016, 1:35 PM
lib/DebugInfo/PDB/Raw/PDBFile.cpp
53–56 ↗(On Diff #66166)

Wait, I think I get it. As you deduced, the maximum number of a single page of bits can represent is BlockSize * 8. But the SuperBlock tells you exactly how many pages are in the file. If the number of pages in the SuperBlock are more than the number that can be represented by a single page of bits, then there will be a page in the middle of the file that "continues" the Free Page Map. It will probably be precisely the page with index BlockSize * 8

majnemer added inline comments.Jul 29 2016, 1:36 PM
lib/DebugInfo/PDB/Raw/PDBFile.cpp
53–56 ↗(On Diff #66166)

Yeah, that seems to match my reading of things. Would be good to check with a big pdb like chrome_child.

ruiu added inline comments.Jul 29 2016, 1:39 PM
lib/DebugInfo/PDB/Raw/PDBFile.cpp
52 ↗(On Diff #66166)

What is a bit annoying is that BitVector doesn't provide a way to map itself onto an existing bitmap, so I need to examine each bit and flip a bit in a BitVector. But, because the number of bits we have here is small anyways, it is probably okay to copy bit by bit. I'll try to add BitVector to MsfLayout.

zturner added inline comments.Jul 29 2016, 1:41 PM
lib/DebugInfo/PDB/Raw/PDBFile.cpp
52 ↗(On Diff #66166)

It seems like we could just add a new constructor to the BitVector class like this:

BitVector(BitWord *Buffer, uint32_t NumBits);

ruiu added inline comments.Jul 29 2016, 2:00 PM
lib/DebugInfo/PDB/Raw/PDBFile.cpp
52 ↗(On Diff #66166)

BitWord is an internal thing of BitVector, so it probably should be uint8_t * and NumBits, or ArrayRef<uint8_t>. I'll try to see if I can add a new constructor.

majnemer added inline comments.Jul 29 2016, 2:01 PM
lib/DebugInfo/PDB/Raw/PDBFile.cpp
52 ↗(On Diff #66166)

It already has BitVector::setBitsInMask.

ruiu added inline comments.Jul 29 2016, 2:09 PM
lib/DebugInfo/PDB/Raw/PDBFile.cpp
52 ↗(On Diff #66166)

I took a look at the function and noticed that it takes a pointer to uint32_t. So the endianess matters here, so I cannot use it for an array read from file (unless swapping bytes before passing it to setBitsInMask).

majnemer added inline comments.Jul 29 2016, 2:22 PM
lib/DebugInfo/PDB/Raw/PDBFile.cpp
52 ↗(On Diff #66166)

You could just loop and call setBitsInMask.

ruiu updated this revision to Diff 66175.Jul 29 2016, 2:27 PM
  • Move FPM to MsfLayout.
  • Use BitVector to represent FPM.
ruiu added inline comments.Jul 29 2016, 2:29 PM
lib/DebugInfo/PDB/Raw/PDBFile.cpp
52 ↗(On Diff #66175)

How?

zturner edited edge metadata.Jul 29 2016, 2:32 PM

In MsfBuilder::build(), can you set the value of FreePageMap before returning?

lib/DebugInfo/PDB/Raw/PDBFile.cpp
52 ↗(On Diff #66175)

All we really want is a simple memset though. The function is already there init_words(), it's just private.

zturner accepted this revision.Jul 29 2016, 2:38 PM
zturner edited edge metadata.
This revision is now accepted and ready to land.Jul 29 2016, 2:38 PM
This revision was automatically updated to reflect the committed changes.