To allow loading PDB file with target symbols add command.
Details
Diff Detail
- Repository
- rG LLVM Github Monorepo
Event Timeline
How can I start lldb at specified path so that target symbols add bar.pdb could find the pdb path? (For testing)
I think this is a good start. See inline comments for details. In addition to the test for the "target symbols add" flow it would be good to have a separate test for the ObjectFilePDB functionality. You can look at existing tests in test/Shell/ObjectFile for inspiration.
| lldb/source/Plugins/ObjectFile/PDB/ObjectFilePDB.cpp | ||
|---|---|---|
| 11 | You should be able to delete this line too clang-format should know to put this file first. | |
| 46 | There's also a third copy of this code in ProcessMinidump (MinidumpParser.cpp). I think it's time to factor this out somehow. We could either create a new file in the Utility module, or put this functionality in directly inside the UUID class -- I'm not currently sure what's better. | |
| 173 | I'm pretty sure this check is bogus and can never be false. | |
| 177–198 | Could this use the same algorithm as ObjectFilePDB::GetArchitecture ? | |
| 208–210 | Leave this unimplemented. PDBs don't have sections and are not loaded into memory, so the function does not apply. | |
| 216–217 | This shouldn't be necessary. What "normally" happens here is that an object file is adding own sections *into* the "unified" section list -- not the other way around. Since we're pretending that the pdb file has no sections (and that's kind of true) I think you should just return an empty section list here, and not touch the unified list at all. | |
| lldb/source/Plugins/ObjectFile/PECOFF/ObjectFilePECOFF.cpp | ||
| 145 | revert spurious changes. | |
| lldb/source/Plugins/SymbolFile/NativePDB/SymbolFileNativePDB.cpp | ||
| 267–268 | Instead of changing loadMatchingPDBFile I think we ought to change this to something like: if (auto *pdb = llvm::dyn_cast<ObjectFilePDB>(m_objfile_sp)) {
  file_up = pdb->giveMeAPDBFile(); 
  // PS: giveMeAPDBFile is a new pdb-specific public method on ObjectFilePDB
  // PS: possibly this should not return a PDBFile, but a PDBIndex directly -- I don't know the details but the general idea is that it could/should reuse the pdb parsing state that the objectfile class has already created
} else 
  file_up = do_the_loadMatchingPDBFile_fallback();The reason for that is that it avoids opening the pdb file twice (once in the object file and once here). Another reason is that I believe the entire loadMatchingPDBFile function should disappear. Finding the pdb file should not be the job of the symbol file class -- we have symbol vendors for that. However, we should keep the fallback for now to keep the patch small... | |
| lldb/test/Shell/SymbolFile/NativePDB/load-pdb.cpp | ||
| 15 | I'd just pass commands to lldb directly via -o. | |
- address comments.
- add tests.
- move loadPDBFile from SymbolFileNativePDB.cpp to ObjectFilePDB.cpp and use it to create unique_ptr of PDBFile so that we can move it around.
| lldb/source/Plugins/ObjectFile/PDB/ObjectFilePDB.cpp | ||
|---|---|---|
| 177–198 | No. This part is the same algorithm as the part in ObjectFilePECOFF::GetModuleSpecifications, https://github.com/llvm/llvm-project/blob/master/lldb/source/Plugins/ObjectFile/PECOFF/ObjectFilePECOFF.cpp#L194-L215. The ObjectFilePDB::GetArchitecture is same as ObjectFilePECOFF::GetArchitecture. | |
| 208–210 | This is actually necessary. It's called at SymbolFileNativePDB::InitializeObject. Otherwise, setting breakpoint won't work. | |
| 216–217 | The reason why I copy section list from unified_section_list to m_sections_up is to let GetBaseAddress be able to return the right base address. | |
| lldb/source/Plugins/SymbolFile/NativePDB/SymbolFileNativePDB.cpp | ||
| 267–268 | From what I see, it seems like PDBFile is designed to be created when opening the pdb file. It doesn't allow copying it around. I moved loadPDBFile to ObjectFilePDB as static function, so we can move PDBFile easily. | |
| lldb/source/Plugins/ObjectFile/PDB/ObjectFilePDB.cpp | ||
|---|---|---|
| 177–198 | I'm not sure this kind of consistency actually helps. For example, in Module::MatchesModuleSpec, module's own architecture (as provided by ObjectFile(PECOFF)::GetArchitecture) is compared to the one in the ModuleSpec (generally coming from ObjectFile(PDB)::GetModuleSpecifications. If there's any mismatch between the two ways of computing a ArchSpec, then there will be trouble, even though they are "symmetric". That said, I think we can keep this like it is in this patch, but I'd like to clean up both implementations in a another patch. | |
| 208–210 | So, that's a bug in SymbolFileNativePDB::InitializeObject. It should be calling m_objfile_sp->GetModule()->GetObjectFile()->GetBaseAddress() (as done in e.g. SymbolFileBreakpad::GetBaseFileAddress) instead of m_objfile_sp->GetBaseAddress(). That will get you the base address of the main (exe) object file instead of the pdb. Up until now, that distinction was not important because m_objfile_sp was always the main file... | |
| lldb/source/Plugins/SymbolFile/NativePDB/SymbolFileNativePDB.cpp | ||
| 267–268 | You don't need to copy it -- you can just have the two objects reference the same instance (stealing the PDBFile object from under ObjectFilePDB seems suboptimal). Probably the simplest way to achieve that is to change the PdbIndex class to use a raw pointer (and ObjectFilePDB to provide one). Then SymbolFilePDB can have an additional unique_ptr<PDBFile> member, which may or may not be empty (depending on whether it fetches object from ObjectFilePDB or loads it itself). Eventually, as we transition to always fetching the PDBFile from ObjectFilePDB, that member will disappear completely. Also, I wasn't expecting you to take the giveMeAPDBFile name literally. :) I was deliberatly using a funky name as a placeholder because I wasn't sure of how exactly this would work -- I was trying to illustrate the general concept, and not a precise implementation. If what, I said in the previous paragraph works, then a "normal" name like GetPDBFile should work just fine... Sorry about the confusion. | |
| 267–275 | You should be able to delete this block now -- the code you're writing now supersedes (and enhances) it. If the module has a SymbolFileSpec, then this file will be loaded (as an ObjectFilePDB) in SymbolVendor::FindPlugin, and that instance will be passed to the SymbolFilePDB constructor. | |
| lldb/test/Shell/ObjectFile/PDB/object.test | ||
| 4 | Could we generate the pdb directly from yaml (perhaps via llvm-pdbutil yaml2pdb)? That way we could have a hardcoded UUID and explicit test for its parsing. | |
| lldb/test/Shell/ObjectFile/PDB/symbol.test | ||
| 4 ↗ | (On Diff #299826) | This is SymbolFile functionality, so it'd be better placed in Shell/SymbolFile/PDB | 
| lldb/source/Plugins/ObjectFile/PDB/ObjectFilePDB.cpp | ||
|---|---|---|
| 177–198 | So, we want to refactor out the common part of ObjectFilePECOFF::GetArchitecture and ObjectFilePDB::GetModuleSpecifications to keep them consistent? I will do it in another patch. | |
| 208–210 | Thanks for pointing it out. I'll update it and leave GetBaseAddress and CreateSections as unimplemented. | |
| lldb/source/Plugins/SymbolFile/NativePDB/SymbolFileNativePDB.cpp | ||
| 267–268 | 
 I think you meant SymbolFileNativePDB, but why do we need this? It already has a member unique_ptr<PDBIndex>. I might misunderstood. I changed it to have loadPDBFile return raw pointer and ObjectFilePDB has a raw pointer member and PdbIndex uses raw pointer. | |
| lldb/test/Shell/ObjectFile/PDB/object.test | ||
| 4 | I noticed when I use llvm-pdbutil pdb2yaml to generate yaml file from pdb file and then use llvm-pdbutil yaml2pdb to get pdb file back, the two pdb files are not the same. Also using lldb-test symbols with the pdb created from yaml file, got the error: error: Module has no symbol vendor. This doesn't happen if I use the original pdb file. | |
| lldb/test/Shell/ObjectFile/PDB/symbol.test | ||
| 4 ↗ | (On Diff #299826) | I put it here because I saw symbol.yaml which uses lldb-test symbols exists in Shell/ObjectFile/PECOFF. And the folder Shell/SymbolFile seems to contain tests for SymbolFile plugins. | 
| lldb/source/Plugins/ObjectFile/PDB/ObjectFilePDB.cpp | ||
|---|---|---|
| 177–198 | I'd actually say all four functions (ObjectFile{PECOFF,PDB}::{GetArchitecture,GetModuleSpecifications}) should use the same algorithm, but lets leave that for the other patch. | |
| lldb/source/Plugins/SymbolFile/NativePDB/PdbIndex.cpp | ||
| 42 | What's up with the reference? | |
| lldb/source/Plugins/SymbolFile/NativePDB/SymbolFileNativePDB.cpp | ||
| 267–268 | Yeah, but who's now responsible for deleting the PDBFile object? You just leak it, right? My proposal was the following: 
 The imagined implementation was something like: class ObjectFilePDB {
  std::unique_ptr<PDBFile> m_file_up; // Always non-null
  PDBFile &GetPDBFile() { return *m_file_up; }
};
class SymbolFilePDB {
  std::unique_ptr<PDBIndex> m_pdb_index;
  std::unique_ptr<PDBFile> m_file_up; // May be null
};
SymbolFilePDB::CalculateAbilities() {
  ...
  PDBFile *pdb_file;
  if (loading_from_objfile)
    pdb_file = &objfile->GetPDBFile();
  else {
    m_file_up = loadMatchingPDBFile(...);
    pdb_file = m_file_up.get();
  }
  m_index = PdbIndex::Create(*pdb_file);
}Other implementations are possible as well... | |
| lldb/test/Shell/ObjectFile/PDB/object.test | ||
| 4 | Yeah, unfortunately it looks like pdb2yaml does not support some pdb streams (though I also wonder if lldb is not being too strict about the kinds of streams it expects). It would be nice to fix that one day. Nonetheless, the current pdb2yaml functionality seems sufficient for the purposes of this particular test, so I think we should use that here. The hardcoded uuid here may be brittle as it depends on the exact algorithm that lld uses to compute it. | |
| lldb/test/Shell/ObjectFile/PDB/symbol.test | ||
| 4 ↗ | (On Diff #300133) | This is pretty strange way to use the pdb file and I'm surprised that it actually works (it probably shouldn't). The pdb file should not normally be the main object file of a module. You probably want something like lldb-test symbols %t.exe -symbol-file %t.pdb. | 
| 4 ↗ | (On Diff #299826) | The fact that that test uses the symbols function of lldb-test is not ideal, but the functionality it tests (symbol table parsing) is definitely an ObjectFile feature. OTOH, Parsing of Types, CompileUnits and Functions definitely falls into the SymbolFile purview. Given that the other test can be moved to yaml2pdb, and you don't have to worry about duplicating inputs, I'd move this test to the SymbolFile folder That said, I'm not sure what is it that you actually wanted to test here. What's being done here that is not covered by the load-pdb.test? You could potentially add -o "b main" (to force symfile parsing) and -o "image dump symfile" to the other test if you wanted to check that the pdb can actually be parsed this way. | 
| lldb/test/Shell/SymbolFile/NativePDB/load-pdb.cpp | ||
| 15 | It's simpler to just pass the executable as an argument instead of wrapping it in -o "target create". | |
Address comments.
| lldb/source/Plugins/ObjectFile/PDB/ObjectFilePDB.cpp | ||
|---|---|---|
| 177–198 | It's interesting that GetArchitecture and GetModuleSpecifications in ObjectFilePECOFF currently do not have the same algorithm, but works fine. | |
| lldb/source/Plugins/SymbolFile/NativePDB/PdbIndex.cpp | ||
| 42 | Removed reference. | |
| lldb/source/Plugins/SymbolFile/NativePDB/SymbolFileNativePDB.cpp | ||
| 267–268 | Thanks for declaration. | |
| lldb/test/Shell/ObjectFile/PDB/object.test | ||
| 4 | I switched to use yaml2pdb. The uuid is swapped byte order when reading from pdb. | |
| lldb/test/Shell/ObjectFile/PDB/symbol.test | ||
| 4 ↗ | (On Diff #299826) | I'll put the testing of symfile parsing in load-pdb.cpp. | 
This looks pretty good, both the patch and Pavel's insights. I don't see much to comment on that Pavel didn't already catch.
| lldb/source/Plugins/ObjectFile/PDB/ObjectFilePDB.cpp | ||
|---|---|---|
| 169 | For me, the name spec is confusing, because this code is mostly dealing with ModuleSpecs but spec is a reference to the ArchSpec of the module_spec. Perhaps module_arch would make this clearer. Then the code below would follow a pattern like: module_arch.SetTriple("blah");
specs.Append(module_spec); | |
lgtm
| lldb/source/Plugins/ObjectFile/PDB/CMakeLists.txt | ||
|---|---|---|
| 7 | I don't see this using any Host functionality. | |
| lldb/source/Plugins/ObjectFile/PECOFF/ObjectFilePECOFF.cpp | ||
| 145 | do revert this before committing. | |
| 855 | this too | |
| lldb/test/Shell/ObjectFile/PDB/object.test | ||
| 2 | Consider "inlining" the yaml input into this file. | |
It has regressed buildbots:
SymbolFile/NativePDB/load-pdb.cpp lldb-x86_64-fedora: http://lab.llvm.org:8014/#/builders/14/builds/1006 lldb-x86_64-debian: http://lab.llvm.org:8011/#/builders/68/builds/782
Just put a fix here: https://reviews.llvm.org/rG4b83747ab1577b7899116b0ddb75f05de2471694
This should fix it.
load-pdb is still failing on the Windows bot. Can you please commit a fix soon or revert the change?
It's slash vs back slash thing. Put a fix at https://reviews.llvm.org/rG779deb9750a4853485ac7beca86f518b067ad6d6
I don't see this using any Host functionality.