This patch improves performance of SymbolFilePDB on huge executables in two ways:
- cache names of public symbols by address. When creating variables we are trying to get a mangled name for each one, and in GetMangledForPDBData we are enumerating all public symbols, which takes O(n) for each variable. With the cache we can retrieve a mangled name in O(log(n));
- cache section contributions. When parsing variables for context we are enumerating all variables and check if the current one is belonging to the current compiland. So we are retrieving a compiland ID for the variable. But in PDBSymbolData::getCompilandId for almost every variable we are enumerating all section contributions to check if the variable is belonging to it, and get a compiland ID from the section contribution if so. It takes O(n) for each variable, but with caching it takes about O(log(n)). I've placed the cache in SymbolFilePDB and have created GetCompilandId there. It actually duplicates PDBSymbolData::getCompilandId except for the cache part. Another option is to support caching in PDBSymbolData::getCompilandId and to place cache in DIASession, but it seems that the last one doesn't imply such functionality, because it's a lightweight wrapper over DIA and whole its state is only a COM pointer to the DIA session. Moreover, PDBSymbolData::getCompilandId is used only inside of SymbolFilePDB, so I think that it's not a bad place to do such things. With this patch PDBSymbolData::getCompilandId is not used at all.
This bottlenecks were found with profiling. I've discovered these on a simple demo project of Unreal Engine (x86 executable ~72M, PDB ~82M).
This patch doesn't change external behavior of the plugin, so I think that there's no need for additional testing (already existing tests should warn us about regress, if any).