This change implements readFromExe, and calculating VA and RVA, which
are some of the functionalities that will be used for native PDB reading
for llvm symbolizer.
Details
Diff Detail
- Repository
- rG LLVM Github Monorepo
Event Timeline
The load address stuff looks fine.
I think the bit that tries to find the PDB file probably needs more flexibility or a bypass for the LLDB and Crash folks. I think we should include Pavel (labath) for his opinion on that bit. I recall there was lots of discussion about this in the Crash/Lexan meetings long ago.
llvm/lib/DebugInfo/PDB/Native/NativeSession.cpp | ||
---|---|---|
163 | I think the LLDB folks might be concerned about searching the file system for the corresponding PDB file. The debugger folks are probably going to want a way to override that for remote debugging, symbol servers, virtual file systems, etc. Perhaps the right thing to do is to have an API that takes paths to both the EXE and the PDB. If everything loads, great, otherwise return an error. Then you could have an API like this that tries to find the PDB, but rather than having it then load the session, it just forwards the PDB path to the two-path API. That would make it possible for a client to get the automatic behavior you have here while also enabling LLDB to be explicit about where the PDB should be. |
If I am not mistaken the "native" PDB plugin (which is the "future" for lldb ) does not use this code path at all -- it goes directly for the llvm::pdb::***Stream classes. As such, I don't think any special hooks need to be added here. Which isn't to say that searching the CWD unconditionally is a particularly nice interface for a library, but it is pretty similar to what the llvm::DWARFContext does...
llvm/lib/DebugInfo/PDB/Native/NativeSession.cpp | ||
---|---|---|
82–141 | Maybe rename Path to ExePath so the code below that replaces the exe filename with the pdb filename is a bit more readable. | |
98 | I see what you mean that it was mostly unimplemented. :) | |
163 | Building on Adrian's idea, I think the API might look kind of like this: struct PdbSearchOptions { std::string ExePath; // Options we can add later: // std::vector<std::string> SearchDirs; // bool UseEnvNtSymbolPath = false; // llvm::VirtualFileSystem *SearchVFS = nullptr; // Way to remap paths }; Expected<std::string> searchForPdb(const PdbSearchOptions &Opts) { ... } The search would return the first existing file with matching PDB magic. The process of searching for a PDB file can be quite involved. Here are some things that we may implement one day:
Even if we don't implement the symsrv protocol, it would be useful to teach LLVM how to find PDBs in a symbol server cache. For example, I have these kernel32.dll PDBs in C:\symbols, which windbg put them: $ ls /c/symbols/kernel32.pdb/* /c/symbols/kernel32.pdb/548CAA89F1FCF0193D12D2DC02F8920E1: kernel32.pdb /c/symbols/kernel32.pdb/63816243EC704DC091BC31470BAC48A31: kernel32.pdb If llvm-symbolizer wants to be able to symbolize system DLLs, that will be our best bet. | |
213 | Checking the sign bit makes sense, but these seem like simpler ways to express that:
| |
221 | If we can subtract the virtual address out of it, are you sure it is an RVA? It seems like it's supposed to be a virtual address... If it's a virtual address, though, I would expect it to require a uint64_t. Hm. | |
llvm/unittests/DebugInfo/PDB/Inputs/SimpleTest.cpp | ||
5 | I tried to think of how to use yaml2pdb to avoid the binary files, but I don't like any of my solutions. I'm OK with the binary files. |
llvm/lib/DebugInfo/PDB/Native/NativeSession.cpp | ||
---|---|---|
221 | I think the section header virtual addresses are also relative to the load address. |
Thanks. I appreciate the separation of concerns with regard to finding and loading of the PDB.
Hi, llvm/object/COFF.h does not work on a case-sensitive filesystem. I fixed it in 4927ae085807731eb4052e0a682443fe9399b512. BTW, you can strip unneeded Phabricator tags (Subscribers: Tags: Reviewers:) with
arcfilter () { arc amend git log -1 --pretty=%B | awk '/Reviewers:|Subscribers:/{p=1} /Reviewed By:|Differential Revision:/{p=0} !p && !/^Summary:$/ {sub(/^Summary: /,"");print}' | git commit --amend -F - }
http://45.33.8.238/linux/15836/step_12.txt
After fixing the -DLLVM_ENABLE_ABI_BREAKING_CHECKS=on problem locally, the unittest (unittests/DebugInfo/PDB/DebugInfoPDBTests) still fails.
(
An erroring Expected must be explicitly checked to avoid -DLLVM_ENABLE_ABI_BREAKING_CHECKS=on failures.
if (Expected<std::unique_ptr<PDBFile>> File = loadPdbFile(PdbPath, Allocator)) return std::string(PdbPath); else return File.takeError();
)
Even more, as the test reads the PDB file path from that EXE file (llvm-project/llvm/unittests/DebugInfo/PDB/Inputs/SimpleTest.exe), that path is an absolute path in Windows format (C:\\src\\llvm-project\\llvm\\unittests\\DebugInfo\\PDB\\Inputs\\SimpleTest.pdb). On Linux builds, that always fail and, on Windows, we are only luck if we have the same dir layout by chance. Could you fix that by taking relative path and disable that test on non-Windows platforms?
Relanded in 23609331472be22be3be134f6facd8f086c5ea49 so that tests pass on Linux. I changed the PDB searching part to specify the path style, so that it correctly finds the filename on Linux.
Maybe rename Path to ExePath so the code below that replaces the exe filename with the pdb filename is a bit more readable.