Currently, the symbolizer lib can only symbolize a file on disk.
This patch teaches the symbolizer lib to symbolize objects.
llvm-objdump needs this to support archive disassembly with source info.
Details
Diff Detail
- Repository
- rL LLVM
Event Timeline
llvm/test/tools/llvm-objdump/X86/disassemble-archive-with-source.ll | ||
---|---|---|
3 ↗ | (On Diff #205452) | debuginfo is usually two words (i.e. debug info) or better yet, call it debug data, to avoid confusion with the .debug_info DWARF section. |
So I'm happy with this approach, but I'd like somebody else to confirm that they are too.
I am probably not the right person to accept this, so minor suggestion is below.
llvm/lib/DebugInfo/Symbolize/Symbolize.cpp | ||
---|---|---|
92 ↗ | (On Diff #205678) | I'd simplify. const auto &I = Modules.find(ModuleName); if (I != Modules.end()) return symbolizeCodeCommon(I->second.get(), ModuleOffset); Expected<SymbolizableModule *> InfoOrErr = createModuleInfo(&Obj, std::move(Context), ModuleName); if (!InfoOrErr) return InfoOrErr.takeError(); return symbolizeCodeCommon(InfoOrErr.get(), ModuleOffset); |
llvm/lib/DebugInfo/Symbolize/Symbolize.cpp | ||
---|---|---|
92 ↗ | (On Diff #205678) | Let's avoid the lifetime extension: auto I = Modules.find(ModuleName); const auto &I is a const reference to a temporary iterator. It is odd. |
llvm/lib/DebugInfo/Symbolize/Symbolize.cpp | ||
---|---|---|
403 ↗ | (On Diff #205678) | I think we can split getOrCreateModuleInfo into two halves. The latter half can be reused by this part of code in the new symbolizeCode: std::unique_ptr<DIContext> Context = DWARFContext::create(Obj, nullptr, DWARFContext::defaultErrorHandler); /////////// codeview is not supported Expected<SymbolizableModule *> InfoOrErr = createModuleInfo(&Obj, std::move(Context), ModuleName); if (!InfoOrErr) return InfoOrErr.takeError(); Info = InfoOrErr.get(); This way, it may make it easy to add codeview support for archive files is someone cares about it. (getOrCreateObjectPair handles .gnu_debuglink, but I think we can forget about that for an archive member.. |
470 ↗ | (On Diff #205678) | Nonnullness of Context is obvious. |
llvm/lib/DebugInfo/Symbolize/Symbolize.cpp | ||
---|---|---|
82 ↗ | (On Diff #206255) | This TODO seems somewhat arbitrary in this location. Is CodeView supported for other parts of LLVM symbolizer? What would happen if somebody tried to use this bit of code with CodeView, compared with old behaviour? |
llvm/lib/DebugInfo/Symbolize/Symbolize.cpp | ||
---|---|---|
82 ↗ | (On Diff #206255) | I'm not entirely sure. I know PDB and CodeView are not exactly the same thing. Looks like the name is used interchangeably in this context? COFFObjectFile::getDebugPDBInfo said it is COFF::IMAGE_DEBUG_TYPE_CODEVIEW. If somebody uses this code with CodeView, loadDataForEXE would fail because of Objects.first->getFileName(), symbolizeCode would return error. Old behavior of symbolizeCode(const ObjectFile &Obj, ..) would always create a DWARFContext. // If this is a COFF object containing PDB info, use a PDBContext to // symbolize. Otherwise, use DWARF. if (auto CoffObject = dyn_cast<COFFObjectFile>(Objects.first)) { const codeview::DebugInfo *DebugInfo; StringRef PDBFileName; auto EC = CoffObject->getDebugPDBInfo(DebugInfo, PDBFileName); if (!EC && DebugInfo != nullptr && !PDBFileName.empty()) { using namespace pdb; std::unique_ptr<IPDBSession> Session; if (auto Err = loadDataForEXE(PDB_ReaderType::DIA, Objects.first->getFileName(), Session)) { Modules.insert( std::make_pair(ModuleName, std::unique_ptr<SymbolizableModule>())); // Return along the PDB filename to provide more context return createFileError(PDBFileName, std::move(Err)); } Context.reset(new PDBContext(*CoffObject, std::move(Session))); } } |
llvm/lib/DebugInfo/Symbolize/Symbolize.cpp | ||
---|---|---|
82 ↗ | (On Diff #206255) | I'm not comfortable with a loss in functionality versus the previous version, unless the old functionality was broken and wouldn't work anyway. Not knowing enough about how CodeView/PDB work, I'm not sure I can comment much further. |
I don't think it's a loss of functionality versus the previous version which creates DWARFContext even for COFF object (not work obviously). This version still does not work for COFF archive but looks a little bit more reasonable. IMHO, we should have no or all support for COFF archive, not partial support. I could either
- Land the previous version (with the additional check for COFF so users are notified of the lack of support) and revisit COFF archive support later.
- Support COFF archive file in this patch with the necessary tests.
I would prefer the first choice since one patch for each makes more sense. @MaskRay @jhenderson thoughts?
llvm/lib/DebugInfo/Symbolize/Symbolize.cpp | ||
---|---|---|
82 ↗ | (On Diff #206255) | This todo (archive not supported) is fine, as long as the normal code path LLVMSymbolizer::symbolizeCode(const std::string &ModuleName, object::SectionedAddress ModuleOffset) below still works. |
93 ↗ | (On Diff #206255) | Should make sure test/tools/llvm-symbolizer/pdb/ still works. |
If we don't support COFF archives already at all (i.e. they don't work under any situation), then I don't think there's a problem that needs solving here, but we should definitely not make things worse. If archives containing COFF with PDBs work as well as before, then I think this is fine. If they no longer work, that's a problem and you should go back to the old behaviour. Beyond that, I don't think you need to fix the world in this patch. You've fixed the problem for DWARF-based debug information, and further improvements can be separate patches. Perhaps best would be to file a separate bug to track the additional cases that are unfixed by this?
I'm pretty sure the PDB symbolization context assumes it has a PE image (EXE or DLL), which it can use to extract the moral equivalent of .gnu_debuglink to find the PDB. That's all handled internally.
Hypothetically, we could change things to read the debug info sections from plain object files before they have been linked, but it's difficult and probably not worth the effort.
My comment on your first version of this patch is about the potential code duplication. Now the codeview concern of the code duplication has been addressed.
The current version doesn't look bad to me. Do you think the initial version is better?
The expected client for this new symbolizeCode interface is for archive members and COFF is unlikely to symbolize plain object files for the moment.
I think the concern here the code movement for removing the potential duplication puts some untested code in the path for DWARF. I assume if there is a need for the COFF archive member symbolization we could easily deduplicate the code in the patch for COFF?
I think you meant "puts some untested code in the path for COFF". That is fine to me.
llvm/lib/DebugInfo/Symbolize/Symbolize.cpp | ||
---|---|---|
431 ↗ | (On Diff #206255) | You can place this block before createModuleInfo to minimize the diff. |
llvm/lib/DebugInfo/Symbolize/Symbolize.cpp | ||
---|---|---|
429 ↗ | (On Diff #207601) | Nit: std::error_code |