This is an archive of the discontinued LLVM Phabricator instance.

Differential D53368

[Symbol] Search symbols with name and type in a symbol file
ClosedPublic

Authored by aleksandr.urakov on Oct 17 2018, 5:11 AM.

Details

Reviewers

zturner
asmith
labath
clayborg
• espindola

Commits

rG8cfb12b9bd04: [Symbol] Search symbols with name and type in a symbol file
rG15da7684db33: [Symbol] Search symbols with name and type in a symbol file
rLLDB347960: [Symbol] Search symbols with name and type in a symbol file
rL347960: [Symbol] Search symbols with name and type in a symbol file
rL345957: [Symbol] Search symbols with name and type in a symbol file
rLLDB345957: [Symbol] Search symbols with name and type in a symbol file

Summary

This patch adds the possibility of searching for a public symbol by name and type in a symbol file, not only in a symtab. It is helpful when working with PE, because PE symtabs contain only imported / exported symbols. Such a search is required e.g. for evaluating an expression that calls some function of the debuggee.

A few weeks ago there was a discussion of this on lldb-dev, titled "Symtab for PECOFF".

Diff Detail

Event Timeline

aleksandr.urakov created this revision. Oct 17 2018, 5:11 AM

Herald added a subscriber: lldb-commits. Oct 17 2018, 5:11 AM

This seems like the sort of thing Greg should have a look at as well.

All symbol tables are currently extracted from the object files via ObjectFile::GetSymtab(). Are symbols only in the PDB file? If so I would vote to add a "virtual void SymbolVendor::AddSymbols(Symtab *symtab)" and a "virtual void SymbolFile::AddSymbols(Symtab *symtab)" where we take the symbol table that comes from the object file and we can add symbols to it if the symbol file has symbols it wants to add to the object file's symbol table. All symbol queries go through the lldb_private::Symtab class anyway. Care must be taken to watch out for symbols that might already exist from an ObjectFile's symbol table to ensure we don't have duplicates.

So I would:

Add "virtual void SymbolVendor::AddSymbols(Symtab *symtab);" to SymbolVendor that just calls through to its SymbolFile to do the work
Add "virtual void SymbolFile::AddSymbols(Symtab *symtab)" to SymbolFile with default implementation that does nothing
Override SymbolFile::AddSymbols() for SymbolFilePDB and add symbols to the provided symbol table
Modify "SymbolVendor::GetSymtab()" to get the object file symbol table, then pass that along to any symbol file instances it owns to allow each symbol file to augment the symbol table
Remove all "FindPublicSymbols()" code from patch
Revert all symbol searching code to just use the Symtab class now that it contains all needed symbols
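
The steps proposed above can be sketched roughly as follows. This is only an illustration under stated assumptions: the `Symbol`, `Symtab`, `SymbolFile` and `SymbolVendor` types below are simplified stand-ins rather than the real LLDB classes, `FakeSymbolFilePDB` is a hypothetical name, and the duplicate check (same name and address) is just one possible way to avoid adding a symbol the object file already provided.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <memory>
#include <string>
#include <unordered_set>
#include <utility>
#include <vector>

// Simplified stand-ins for illustration only; NOT the real LLDB classes.
struct Symbol {
  std::string name;
  uint64_t file_addr;
};

class Symtab {
public:
  // Adds a symbol unless an identical (name, address) pair is already present.
  // Returns true if the symbol was added.
  bool AddSymbol(const Symbol &sym) {
    const std::string key = sym.name + "@" + std::to_string(sym.file_addr);
    if (!m_seen.insert(key).second)
      return false; // duplicate, e.g. already provided by the object file
    m_symbols.push_back(sym);
    return true;
  }
  std::size_t GetNumSymbols() const { return m_symbols.size(); }

private:
  std::vector<Symbol> m_symbols;
  std::unordered_set<std::string> m_seen;
};

class SymbolFile {
public:
  virtual ~SymbolFile() = default;
  // Default implementation: this symbol file has nothing to contribute.
  virtual void AddSymbols(Symtab * /*symtab*/) {}
};

// Hypothetical PDB-like symbol file that holds a list of public symbols.
class FakeSymbolFilePDB : public SymbolFile {
public:
  explicit FakeSymbolFilePDB(std::vector<Symbol> publics)
      : m_publics(std::move(publics)) {}

  void AddSymbols(Symtab *symtab) override {
    assert(symtab && "caller must pass a valid symtab");
    for (const Symbol &sym : m_publics)
      symtab->AddSymbol(sym);
  }

private:
  std::vector<Symbol> m_publics;
};

class SymbolVendor {
public:
  SymbolVendor(Symtab *objfile_symtab, std::unique_ptr<SymbolFile> sym_file)
      : m_objfile_symtab(objfile_symtab), m_sym_file(std::move(sym_file)) {}

  // Just calls through to the symbol file to do the work.
  void AddSymbols(Symtab *symtab) {
    if (m_sym_file)
      m_sym_file->AddSymbols(symtab);
  }

  // Takes the object file's symtab and lets the symbol file augment it.
  Symtab *GetSymtab() {
    if (m_objfile_symtab)
      AddSymbols(m_objfile_symtab);
    return m_objfile_symtab;
  }

private:
  Symtab *m_objfile_symtab;
  std::unique_ptr<SymbolFile> m_sym_file;
};
```

In this sketch all lookups keep going through `Symtab`, so no separate "find public symbols" path is needed; calling `GetSymtab()` repeatedly is harmless only because of the duplicate check.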

This revision now requires changes to proceed. Oct 22 2018, 10:44 AM

To answer your question, PE/COFF executable symbol tables are basically empty.

Ok, I'll reimplement this, thanks!

I've implemented things as Greg suggested, except for a few points:

I haven't added AddSymbols to SymbolVendor, because it is not used anywhere, and we can just use SymbolFile::AddSymbols directly inside SymbolVendor;
I have made AddSymbols' parameter a reference instead of a pointer, because there is no reason to pass null to this function;
I have cached the symtab in SymbolVendor, so that it is not passed through SymbolFile::AddSymbols every time.

But if you don't agree with some of these I'm ready to discuss.

Very close. Just a bit of cleanup now that we changed SymbolFilePDB where we don't seem to need m_public_symbols anymore. See inlined comments and let me know what you think

include/lldb/lldb-forward.h
444	Do we need this anymore? See inlined comments below.
source/Plugins/SymbolFile/PDB/SymbolFilePDB.cpp
1344–1346	Maybe get the file address from "pub_symbol" and avoid converting to a SymbolSP that we will never use? See more comments below.
1353	Just make a local symbol and convert it from PDB to lldb_private::Symbol here? Something like: symtab.AddSymbol(ConvertPDBSymbol(pdb_symbol)); So it seems we shouldn't need the m_public_symbols storage in this class anymore since "symtab" will now own the symbol we just created.
1968–1969	Maybe convert this to: lldb_private::Symbol ConvertPDBSymbol(const llvm::pdb::PDBSymbolPublicSymbol &pub_symbol) And we only call this if we need to create a symbol we will add to the "symtab" in SymbolFilePDB::AddSymbols(...)
source/Plugins/SymbolFile/PDB/SymbolFilePDB.h
247	Do we need this mapping anymore? We should just add the symbols to the symbol table during SymbolFilePDB::AddSymbols(...).

This revision now requires changes to proceed. Oct 24 2018, 10:34 AM

Ah, yes, sure! It's my mistake. I didn't pay attention to the fact that a symtab owns symbols. I'll update the patch, thanks!

source/Plugins/SymbolFile/PDB/SymbolFilePDB.cpp
1344–1346	Unfortunately there is no method of `PDBSymbolPublicSymbol` that retrieves the file address directly. I'll calculate it as section + offset instead.
1353	The problem here is that `ConvertPDBSymbol` can fail, e.g. if `pub_symbol` somehow contains an invalid section number. We can either change the interface of the function to signal errors, or just assume that the section number is always valid (because we already retrieved the section during the file address calculation). In both cases we will retrieve the section twice. Alternatively, we can pass the already calculated section and offset to `ConvertPDBSymbol`, but that leads to blurred responsibility and a weird interface. So I think it would be better to just create the symbol right here; it doesn't seem to make the code more complex. What do you think?
1968–1969	See the comment above
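
A minimal sketch of the option argued for above: compute the file address as section + offset and create the symbol in place, so the section is looked up only once and an invalid section number is simply skipped. The `Section`, `PublicSymbol` and `Symbol` types and the `BuildSymbols` helper are hypothetical and are not the LLVM PDB or LLDB API; the 1-based section numbering and the offset range check are assumptions of this sketch.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical, simplified types; not the LLVM PDB or LLDB API.
struct Section {
  uint64_t file_addr; // file address of the section start
  uint64_t size;      // section size in bytes
};

struct PublicSymbol {
  std::string name;
  uint32_t section_number; // assumed 1-based, with 0 meaning "no section"
  uint32_t offset;         // offset within that section
};

struct Symbol {
  std::string name;
  uint64_t file_addr;
};

// Looks each section up exactly once and creates the symbol right there.
std::vector<Symbol> BuildSymbols(const std::vector<Section> &sections,
                                 const std::vector<PublicSymbol> &publics) {
  std::vector<Symbol> result;
  for (const PublicSymbol &pub : publics) {
    if (pub.section_number == 0 || pub.section_number > sections.size())
      continue; // invalid section number: skip the symbol
    const Section &section = sections[pub.section_number - 1];
    if (pub.offset >= section.size)
      continue; // offset outside the section (an extra check in this sketch)
    result.push_back(Symbol{pub.name, section.file_addr + pub.offset});
  }
  assert(result.size() <= publics.size());
  return result;
}
```

Because the validity check and the address calculation live in the same loop, there is no separate conversion function that could fail and would need an error-reporting interface.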

aleksandr.urakov updated this revision to Diff 171056. Oct 25 2018, 3:14 AM

aleksandr.urakov marked 3 inline comments as done.

Very close, just down to making the SymbolVendor::GetSymtab() call symtab.CalculateSymbolSizes() and symtab.Finalize().

source/Plugins/SymbolFile/PDB/SymbolFilePDB.cpp
1382–1383	Seems like these two lines should be done in the symbol vendor? Maybe this function should return the number of symbols added and the symbol vendor could see if AddSymbols returns a positive number, and if so, call symtab.CalculateSymbolSizes() and symtab.Finalize(). We should also see who else is calling these and remove any calls and only do it in the SymbolVendor one time.

This revision now requires changes to proceed. Oct 25 2018, 10:42 AM

Yes, I'll implement it tomorrow (I'm already OOO now), thanks. But is it really necessary to check the number of symbols added if we must calculate / finalize the symtab after getting it from the object file anyway? Maybe we could just always do it after creation and processing by the symbol file? For each symtab it will be done just once because of caching.

yes, fine to still have void and always call symtab.CalculateSymbolSizes(); and symtab.Finalize() only in the symbol vendor. Find the other places this is called in the ObjectFile plug-ins and remove them and do them once in Symbol vendor when we fetch the symtab for the first time
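
The arrangement agreed at this point in the review (fetch the symtab once, let the symbol file add to it, then calculate sizes and finalize exactly once in the symbol vendor) might look roughly like the sketch below. All types are simplified stand-ins, not the real LLDB classes, and what `CalculateSymbolSizes()` and `Finalize()` do internally here is an assumption made only for the illustration.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Simplified stand-ins for illustration only; NOT the real LLDB classes.
struct Symbol {
  std::string name;
  uint64_t file_addr;
  uint64_t size;
};

class Symtab {
public:
  void AddSymbol(const Symbol &sym) { m_symbols.push_back(sym); }

  // In this sketch: sort by address, then size = distance to the next symbol.
  // The last symbol keeps whatever size it already had.
  void CalculateSymbolSizes() {
    std::sort(m_symbols.begin(), m_symbols.end(),
              [](const Symbol &a, const Symbol &b) {
                return a.file_addr < b.file_addr;
              });
    for (std::size_t i = 0; i + 1 < m_symbols.size(); ++i)
      m_symbols[i].size = m_symbols[i + 1].file_addr - m_symbols[i].file_addr;
  }

  // In this sketch: only records that finalization happened.
  void Finalize() { ++m_finalize_count; }

  std::size_t GetNumSymbols() const { return m_symbols.size(); }
  int GetFinalizeCount() const { return m_finalize_count; }
  const Symbol &SymbolAt(std::size_t idx) const {
    assert(idx < m_symbols.size());
    return m_symbols[idx];
  }

private:
  std::vector<Symbol> m_symbols;
  int m_finalize_count = 0;
};

class ObjectFile {
public:
  // Returns the raw, not yet finalized symtab parsed from the object file.
  Symtab *GetSymtab() {
    if (!m_parsed) {
      m_symtab.AddSymbol({"exported_fn", 0x1000, 0});
      m_parsed = true;
    }
    return &m_symtab;
  }

private:
  Symtab m_symtab;
  bool m_parsed = false;
};

class SymbolFile {
public:
  virtual ~SymbolFile() = default;
  virtual void AddSymbols(Symtab & /*symtab*/) {}
};

// Hypothetical PDB-like symbol file contributing one extra symbol.
class FakeSymbolFilePDB : public SymbolFile {
public:
  void AddSymbols(Symtab &symtab) override {
    symtab.AddSymbol({"internal_fn", 0x1040, 0});
  }
};

class SymbolVendor {
public:
  SymbolVendor(ObjectFile &objfile, SymbolFile &sym_file)
      : m_objfile(objfile), m_sym_file(sym_file) {}

  Symtab *GetSymtab() {
    if (m_symtab)
      return m_symtab; // cached: the work below is done only once
    m_symtab = m_objfile.GetSymtab();
    if (!m_symtab)
      return nullptr;
    m_sym_file.AddSymbols(*m_symtab);
    m_symtab->CalculateSymbolSizes();
    m_symtab->Finalize();
    return m_symtab;
  }

private:
  ObjectFile &m_objfile;
  SymbolFile &m_sym_file;
  Symtab *m_symtab = nullptr;
};
```

The weak point of this layout is visible in the sketch: code that calls `ObjectFile::GetSymtab()` directly, bypassing the vendor, gets a symtab that was never finalized.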

That's done!

Herald added a reviewer: • espindola. Oct 26 2018, 2:04 AM

Herald added subscribers: arichardson, emaste.

aleksandr.urakov added a child revision: D53759: [PDB] Support PDB-backed expressions evaluation. Oct 26 2018, 6:12 AM

Ping! Can you look at this, please?

Looks good. Thanks for making the changes.

This revision is now accepted and ready to land. Nov 1 2018, 10:23 AM

Thank you!

Closed by commit rLLDB345957: [Symbol] Search symbols with name and type in a symbol file (authored by aleksandr.urakov). Nov 2 2018, 1:58 AM

This revision was automatically updated to reflect the committed changes.

This broke MacOS. I'm going to revert this. To reproduce, just run ninja check-lldb with your patches.
Please let me know if you need any other information.

Thanks for catching that! Unfortunately, I have no access to MacOS; can you provide some more info about the failure, please?

Unfortunately the bot logs are gone. When I originally looked at them they weren't particularly informative, so I'm afraid the only way would be that of trying to reproduce this thing on a real machine.

I'm not sure, but I have an assumption. Here is the first green build of green-dragon-24: http://green.lab.llvm.org/green/job/lldb-cmake/12090/ It became green after three changes: one of them is your revert of my commit, while another is Zachary's "Fix the lit test suite". I think it was Zachary's commit that fixed the build, not the revert. Moreover, my commit is Windows specific, and I can't figure out how it could break the MacOS build... So maybe we could recommit it? If it still fails, then we could take the failure logs and revert it again.

@davide You are right, this patch was the cause of the failure, sorry for that. It seems that I've found a generic issue with this patch. Thanks again for pointing that out!

@clayborg The problem is that there are a bunch of places where a symtab is retrieved directly from an object file, not from a symbol vendor, so it remains uncalculated. If we just return the recalculation / finalization to object files, it will fix the issue, but symbols from the PDB will not be available in these places. We could try to use the symbol vendor everywhere in these places instead (we can retrieve a module from an object file, and a symbol vendor from a module, so it is guaranteed that we can get the symbol vendor in all these places). What do you think about such an approach? What pitfalls might it have?

This revision is now accepted and ready to land. Nov 3 2018, 2:21 PM

So it depends on what code was retrieving the symbol table from the object file. Can you detail where this was happening?

Yes, sure. It happens in the following functions:

Module::ResolveSymbolContextForAddress
DynamicLoaderHexagonDYLD::SetRendezvousBreakpoint through findSymbolAddress
DynamicLoaderHexagonDYLD::RendezvousBreakpointHit through findSymbolAddress
JITLoaderGDB::ReadJITDescriptorImpl
ObjectFileELF::GetAddressClass
ObjectFileELF::GetSymtab
ObjectFileELF::RelocateSection
ObjectFileELF::Dump
ObjectFileMachO::GetAddressClass
ObjectFileMachO::ProcessSegmentCommand
SymbolFileDWARF::GetObjCClassSymbol
SymbolFileDWARF::ParseVariableDIE
SymbolFileDWARFDebugMap::CompileUnitInfo::GetFileRangeMap
SymbolFileDWARFDebugMap::InitOSO
SymbolFileDWARFDebugMap::ResolveSymbolContext
SymbolFileDWARFDebugMap::FindCompleteObjCDefinitionTypeForDIE
SymbolFileSymtab::CalculateAbilities
SymbolFileSymtab::ParseCompileUnitAtIndex
SymbolFileSymtab::ParseCompileUnitFunctions
SymbolFileSymtab::ResolveSymbolContext
ObjectFile::GetAddressClass
SymbolVendor::GetSymtab()

Update the patch, move symtab finalization back to object files.

This patch makes object files and symbol files (when they add symbols to a symtab) responsible for finalizing the symtab. This is because a symtab is used in a bunch of places where it's undesirable to retrieve it through a symbol vendor. For example, when the object file itself uses its symtab, we can't retrieve it from the symbol vendor, because the symbol vendor is implemented in terms of an object file, so such a solution would introduce a circular dependency.

But on the other hand, if the object file uses its own symtab, then it likely doesn't rely on the presence of symbols from the symbol file in that symtab. The only things it requires are symbols from the object file and finalization of the symtab.

So after this update we have the following guarantees:

if a symtab is retrieved from an object file, then it's consistent and is guaranteed to contain symbols from the object file. It may (or may not) also contain symbols from a symbol file;
if a symtab is retrieved from a symbol vendor, then it's consistent and is guaranteed to contain symbols from both the object file and the symbol file.

I've taken a look at the places where the symtab is retrieved from an object file, and it seems that the only place we need to change because of these guarantees is Address::GetAddressClass, which should use the symbol vendor as a precaution.

The disadvantages of the current solution are:

when symbols are added to the symtab both from an object file and a symbol file, the symtab is finalized twice;
the symtabs retrieved from different places have different guarantees.

But to solve these we need some other higher-level entity (besides the object file), e.g. the symbol vendor, to own the symtab, and we need to rewrite all the related code in object files and symbol files. The problem is that it's not trivial to do this without breaking a lot of existing code.

What do you think about this approach?
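
The guarantees and the "finalized twice" drawback described in this update can be illustrated with a small sketch. The types are simplified stand-ins rather than the real LLDB classes, `FakeSymbolFilePDB` is hypothetical, and the internals of `Finalize()` are an assumption made for the example.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Simplified stand-ins for illustration only; NOT the real LLDB classes.
struct Symbol {
  std::string name;
  uint64_t file_addr;
};

class Symtab {
public:
  // Adding a symbol makes the table inconsistent until the next Finalize().
  void AddSymbol(const Symbol &sym) {
    m_symbols.push_back(sym);
    m_finalized = false;
  }

  // In this sketch, finalizing means sorting by address.
  void Finalize() {
    std::sort(m_symbols.begin(), m_symbols.end(),
              [](const Symbol &a, const Symbol &b) {
                return a.file_addr < b.file_addr;
              });
    m_finalized = true;
    ++m_finalize_count;
  }

  bool IsFinalized() const { return m_finalized; }
  std::size_t GetNumSymbols() const { return m_symbols.size(); }
  int GetFinalizeCount() const { return m_finalize_count; }

private:
  std::vector<Symbol> m_symbols;
  bool m_finalized = false;
  int m_finalize_count = 0;
};

class ObjectFile {
public:
  // The object file finalizes its own symtab, so direct users get a
  // consistent table without going through the symbol vendor.
  Symtab *GetSymtab() {
    if (!m_parsed) {
      m_symtab.AddSymbol({"exported_fn", 0x1000});
      m_symtab.Finalize();
      m_parsed = true;
    }
    return &m_symtab;
  }

private:
  Symtab m_symtab;
  bool m_parsed = false;
};

class SymbolFile {
public:
  virtual ~SymbolFile() = default;
  virtual void AddSymbols(Symtab & /*symtab*/) {}
};

// Hypothetical PDB-like symbol file: finalizes again after adding symbols.
class FakeSymbolFilePDB : public SymbolFile {
public:
  void AddSymbols(Symtab &symtab) override {
    symtab.AddSymbol({"internal_fn", 0x0800});
    symtab.Finalize(); // second finalization of the same table
  }
};

class SymbolVendor {
public:
  SymbolVendor(ObjectFile &objfile, SymbolFile &sym_file)
      : m_objfile(objfile), m_sym_file(sym_file) {}

  Symtab *GetSymtab() {
    if (m_symtab)
      return m_symtab;
    m_symtab = m_objfile.GetSymtab();
    assert(m_symtab && "this sketch's object file always has a symtab");
    m_sym_file.AddSymbols(*m_symtab);
    return m_symtab;
  }

private:
  ObjectFile &m_objfile;
  SymbolFile &m_sym_file;
  Symtab *m_symtab = nullptr;
};
```

Fetching the table from `ObjectFile` alone yields a finalized symtab holding only the object file's symbol; fetching it through `SymbolVendor` yields the same object, now also holding the symbol file's symbol and finalized a second time.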

aleksandr.urakov requested review of this revision. Nov 21 2018, 7:35 AM

Ping! Can you take a look, please?

clayborg accepted this revision. Nov 29 2018, 7:06 AM

This revision is now accepted and ready to land. Nov 29 2018, 7:06 AM

I've recently started looking at adding a new symbol file format (breakpad symbols). While researching the best way to achieve that, I started comparing the operation of PDB and DWARF symbol files. I noticed a very important difference there, and I think that is the cause of our problems here. In the DWARF implementation, a symbol file is an overlay on top of an object file - it takes the data contained by the object file and presents it in a more structured way.

However, that is not the case with PDB (both implementations). These take the debug information from a completely different file, which is not backed by an ObjectFile instance, and then present that. Since the SymbolFile interface requires them to be backed by an object file, they both pretend they are backed by the original EXE file, but in reality the data comes from elsewhere.

If we had an ObjectFilePDB (which is also not ideal, though in a way it is a better fit to the current lldb organization), then this could expose the PDB symtab via the existing ObjectFile interface and we could reuse the existing mechanism for merging symtabs from two object files.

I am asking this because now I am facing a choice in how to implement breakpad symbols. I could go the PDB way, and read the symbols without an intervening object file, or I could create an ObjectFileBreakpad and then (possibly) a SymbolFileBreakpad sitting on top of that.

The drawbacks of the PDB approach I see are:

I lose the ability to do matching of the (real) object file via symbol vendors. The PDB symbol files now basically implement their own little symbol vendors inside them, which is mostly fine if you just need to find the PDB next to the exe file. However, things could get a bit messy if you wanted to implement some more complex searching on multiple paths, or downloading them from the internet.
I'll hit issues when attempting to unwind (which is the real meat of the breakpad symbols), because unwind info is currently provided via the ObjectFile interface (ObjectFile::GetUnwindTable).

The drawbacks of the ObjectFile approach are:

more code - it needs a new ObjectFile and a new SymbolFile class (possibly also a SymbolVendor)
it will probably look a bit weird because Breakpad files (and PDBs) aren't really object files

I'd like to hear your thoughts on this, if you have any.

In D53368#1313124, @labath wrote:

However, that is not the case with PDB (both implementations). These take the debug information from a completely different file, which is not backed by an ObjectFile instance, and then present that. Since the SymbolFile interface requires them to be backed by an object file, they both pretend they are backed by the original EXE file, but in reality the data comes from elsewhere.

Don't DWARF DWP files work this way as well? How is support for this implemented in LLDB?

I am asking this because now I am facing a choice in how to implement breakpad symbols. I could go the PDB way, and read the symbols without an intervening object file, or I could create an ObjectFileBreakpad and then (possibly) a SymbolFileBreakpad sitting on top of that.

What if SymbolFile interface provided a new method such as GetSymtab() while ObjectFile provides a method called HasExternalSymtab(). When you call ObjectFilePECOFF::GetSymtab(), it could first check if HasExternalSymtab() is true, and if so it could call the SymbolFile plugin and return that

In D53368#1313145, @zturner wrote:

Don't DWARF DWP files work this way as well? How is support for this implemented in LLDB?

There are some similarities, but DWP is a bit different. The main difference is that the DWP file is still an ELF (or whatever) file, so we still have an ObjectFile sitting below the symbol file. The other difference is that in the case of DWP we still have a significant chunk of debug information present in the main executable (mainly various offsets that need to be applied to the unlinked debug info in the dwo/dwp files), so you can still very well say that the symbol file is reading information from the main executable. What DWARF does in this case is create a main SymbolFileDWARF for reading data from the main object file, and then a bunch of inner SymbolFileDWARFDwo/Dwp instances which read data from the other files. There are plenty of things to not like here as well, but at least this maintains the property that each symbol file sits on top of the object file from which it reads its data (and the symtab doesn't go into the dwp file, so there are no issues with that).

What if SymbolFile interface provided a new method such as GetSymtab() while ObjectFile provides a method called HasExternalSymtab(). When you call ObjectFilePECOFF::GetSymtab(), it could first check if HasExternalSymtab() is true, and if so it could call the SymbolFile plugin and return that

I don't think this would be good because there's no way for the PECOFF file to know if we will have a PDB file on top of it. If we don't find the PDB file, then the best we can do is use the list of external symbols as the symtab for the PECOFF file. I think a better way would be to ask the SymbolFile for the symtab. Then the symbol file can either return its own symtab, or just forward the symtab from the object file (we already have a SymbolFileSymtab for cases when we have no debug info). That is more or less what this patch is doing, except that here the SymbolFile is inserting its own symbols into the symtab created by the object file.
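
The alternative floated in this comment, asking the symbol file for the symtab and letting it either forward the object file's table or return its own, could be sketched as below. This is not what the patch implements: `SymbolFile::GetSymtab()`, `FakeSymbolFilePDB` and `SymtabFor` are hypothetical names, and the types are simplified stand-ins rather than LLDB classes.

```cpp
#include <cassert>
#include <string>
#include <utility>
#include <vector>

// Simplified stand-ins for illustration only; NOT the real LLDB classes.
struct Symtab {
  std::vector<std::string> names;
};

class ObjectFile {
public:
  explicit ObjectFile(std::vector<std::string> exported) {
    m_symtab.names = std::move(exported);
  }
  Symtab *GetSymtab() { return &m_symtab; }

private:
  Symtab m_symtab;
};

class SymbolFile {
public:
  explicit SymbolFile(ObjectFile &objfile) : m_objfile(objfile) {}
  virtual ~SymbolFile() = default;

  // Hypothetical method: by default, forward the object file's symtab
  // (the "no debug info" case).
  virtual Symtab *GetSymtab() { return m_objfile.GetSymtab(); }

protected:
  ObjectFile &m_objfile;
};

// Hypothetical PDB-like symbol file that owns and returns its own symtab.
class FakeSymbolFilePDB : public SymbolFile {
public:
  FakeSymbolFilePDB(ObjectFile &objfile, std::vector<std::string> publics)
      : SymbolFile(objfile) {
    m_own.names = std::move(publics);
  }
  Symtab *GetSymtab() override { return &m_own; }

private:
  Symtab m_own;
};

// Callers always go through the symbol file and never need to know whether
// a PDB was found next to the executable.
Symtab *SymtabFor(SymbolFile &sym_file) {
  Symtab *symtab = sym_file.GetSymtab();
  assert(symtab && "every symbol file in this sketch provides a symtab");
  return symtab;
}
```

The point of the design is that the object file does not have to predict whether a symbol file will sit on top of it; the decision is made by whichever symbol file was actually created.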

Great observations Pavel! I think it's really important to have orthogonal/composable abstractions here: the symbols should be decoupled from the container format IMO.

You know more about the ObjectFile than me so I can't say if ObjectFileBreakpad is the best interface, but here are my initial observations (in a somewhat random order):

Even though it doesn't sound like that, ironically, Breakpad might now be better off as an ObjectFile rather than a SymbolFile. The main three pieces of information contained in breakpad files are:

list of symbols
unwind information
line tables

Of these, the first two are presently vended by object files, and only the line table is done by symbol files. The auxiliary pieces of information in the breakpad files (architecture, OS, UUID), are also a property of object files in lldb.

We need clear and separate abstractions for a container (ELF, PE file, Breakpad symbols) vs. the content (debug information).

I agree, unfortunately I think we're quite far from that now. This is complicated by the fact that different "symbol file" formats have different notions of what "symbols" are. E.g. it's obvious that Symtab ended up being vended by the object files because that's how elf+macho/dwarf do things, but that's not the case for pecoff/pdb.

We need to be able to consume symbols when the corresponding module binary is not available. This is common for postmortem debugging (ex. having a minidump + PDBs, but not all the .DLLs or EXE files).

This is also going to be a bit tricky, though slightly orthogonal requirement. Right now things generally assume that a Module has an ObjectFile to read the data from, even though ProcessMinidump tries to work around that with special kinds of Modules. That might be enough to make addresses resolve somewhat reasonably, but I'm not sure what will happen once we start providing symbols for those kinds of modules.

I'm not a fan of the PDB model, where the symbols are searched in both the symtabs then in the symbol files. I'd rather like to see the symtab an interface for symbols regardless of where they come from. (Zach expressed some efficiency concerns if we'd need to build a symtab from a PDB for example as opposed to accessing the PDB symfile directly, although I think we can design to address this - ie. multiple concrete symtab implementations, some of which are *internally* aware of the source container, but w/o leaking this through the interface)

I am afraid I am a bit lost here, as I don't know much about PDBs. I'll have to study this in more detail.

The symbol vendor observation is very important. Right now LLDB has basic support for looking up DWARF symbols and none for PDBs. (for example, IMO LLDB could greatly benefit from something like Microsoft's symsrv - I'm actually planning to look into it soon) (Whatever we do, this should be one of the key requirements IMO)

agreed.

On 29/11/2018 21:29, Leonard Mosescu wrote:

Hi Aleksandr, yes, no objections to this patch.

I was responding to Pavel's comments, which I also assume are forward-looking as well, not strictly related to this patch.

Agreed, and I apologise for hijacking your review (I seem to be getting in the habit of that). I initially thought that having an ObjectFilePDB might mean that this patch would not be needed, but now I am slowly growing to like it. I think it makes sense for a symbol file to be able to provide additional symbols regardless of whether this could be done differently too.

In D53368#1313238, @labath wrote:

I don't think this would be good because there's no way for the PECOFF file to know if we will have a PDB file on top of it.

I'm actually starting to wonder even if GetSymtab() should be part of ObjectFile. The first thing it does is get the Module and then start calling a bunch of stuff on the Module interface. Perhaps the place to start is comparing the Module and ObjectFile interfaces and seeing if the existing APIs make the most sense being moved up to Module. If everything was on Module then the Module has everything it needs to go to the SymbolVendor and find a PDB file.

Closed by commit rL347960: [Symbol] Search symbols with name and type in a symbol file (authored by aleksandr.urakov). Nov 29 2018, 10:59 PM

This revision was automatically updated to reflect the committed changes.

Herald added a subscriber: llvm-commits. Nov 29 2018, 10:59 PM

Revision Contents

Path                                                Size
include/lldb/Symbol/SymbolFile.h                    2 lines
include/lldb/Symbol/SymbolVendor.h                  2 lines
include/lldb/lldb-forward.h                         1 line
source/Plugins/SymbolFile/PDB/SymbolFilePDB.h       6 lines
source/Plugins/SymbolFile/PDB/SymbolFilePDB.cpp     67 lines
source/Symbol/SymbolVendor.cpp                      27 lines
unittests/SymbolFile/PDB/SymbolFilePDBTests.cpp     17 lines

Diff 170889

include/lldb/Symbol/SymbolFile.h

[207 lines not shown] public:

   ObjectFile *GetObjectFile() { return m_obj_file; }
   const ObjectFile *GetObjectFile() const { return m_obj_file; }

   virtual std::vector<CallEdge> ParseCallEdgesInFunction(UserID func_id) {
     return {};
   }

+  virtual void AddSymbols(Symtab &symtab) {}
+
   //------------------------------------------------------------------
   /// Notify the SymbolFile that the file addresses in the Sections
   /// for this module have been changed.
   //------------------------------------------------------------------
   virtual void SectionFileAddressesChanged() {}

   virtual void Dump(Stream &s) {}

[14 lines not shown]

include/lldb/Symbol/SymbolVendor.h

[158 lines not shown] protected:
   TypeList m_type_list; // Uniqued types for all parsers owned by this module
   CompileUnits m_compile_units; // The current compile units
   lldb::ObjectFileSP m_objfile_sp; // Keep a reference to the object file in
                                    // case it isn't the same as the module
                                    // object file (debug symbols in a separate
                                    // file)
   std::unique_ptr<SymbolFile> m_sym_file_ap; // A single symbol file. Subclasses
                                              // can add more of these if needed.
+  Symtab *m_symtab; // Save a symtab once to not pass it through `AddSymbols` of
+                    // the symbol file each time when it is needed

 private:
   //------------------------------------------------------------------
   // For SymbolVendor only
   //------------------------------------------------------------------
   DISALLOW_COPY_AND_ASSIGN(SymbolVendor);
 };

 } // namespace lldb_private

 #endif // liblldb_SymbolVendor_h_

include/lldb/lldb-forward.h

	Show First 20 Lines • Show All 435 Lines • ▼ Show 20 Lines
	typedef std::shared_ptr<lldb_private::StreamFile> StreamFileSP;			typedef std::shared_ptr<lldb_private::StreamFile> StreamFileSP;
	typedef std::shared_ptr<lldb_private::StringSummaryFormat>			typedef std::shared_ptr<lldb_private::StringSummaryFormat>
	StringTypeSummaryImplSP;			StringTypeSummaryImplSP;
	typedef std::unique_ptr<lldb_private::StructuredDataImpl> StructuredDataImplUP;			typedef std::unique_ptr<lldb_private::StructuredDataImpl> StructuredDataImplUP;
	typedef std::shared_ptr<lldb_private::StructuredDataPlugin>			typedef std::shared_ptr<lldb_private::StructuredDataPlugin>
	StructuredDataPluginSP;			StructuredDataPluginSP;
	typedef std::weak_ptr<lldb_private::StructuredDataPlugin>			typedef std::weak_ptr<lldb_private::StructuredDataPlugin>
	StructuredDataPluginWP;			StructuredDataPluginWP;
				typedef std::shared_ptr<lldb_private::Symbol> SymbolSP;
				clayborg [Done]: Do we need this anymore? See inlined comments below.
	typedef std::shared_ptr<lldb_private::SymbolFile> SymbolFileSP;			typedef std::shared_ptr<lldb_private::SymbolFile> SymbolFileSP;
	typedef std::shared_ptr<lldb_private::SymbolFileType> SymbolFileTypeSP;			typedef std::shared_ptr<lldb_private::SymbolFileType> SymbolFileTypeSP;
	typedef std::weak_ptr<lldb_private::SymbolFileType> SymbolFileTypeWP;			typedef std::weak_ptr<lldb_private::SymbolFileType> SymbolFileTypeWP;
	typedef std::shared_ptr<lldb_private::SymbolContextSpecifier>			typedef std::shared_ptr<lldb_private::SymbolContextSpecifier>
	SymbolContextSpecifierSP;			SymbolContextSpecifierSP;
	typedef std::unique_ptr<lldb_private::SymbolVendor> SymbolVendorUP;			typedef std::unique_ptr<lldb_private::SymbolVendor> SymbolVendorUP;
	typedef std::shared_ptr<lldb_private::SyntheticChildren> SyntheticChildrenSP;			typedef std::shared_ptr<lldb_private::SyntheticChildren> SyntheticChildrenSP;
	typedef std::shared_ptr<lldb_private::SyntheticChildrenFrontEnd>			typedef std::shared_ptr<lldb_private::SyntheticChildrenFrontEnd>
	[...]

source/Plugins/SymbolFile/PDB/SymbolFilePDB.h

[...]	public:
uint32_t FindFunctions(const lldb_private::RegularExpression &regex,		uint32_t FindFunctions(const lldb_private::RegularExpression &regex,
bool include_inlines, bool append,		bool include_inlines, bool append,
lldb_private::SymbolContextList &sc_list) override;		lldb_private::SymbolContextList &sc_list) override;

void GetMangledNamesForFunction(		void GetMangledNamesForFunction(
const std::string &scope_qualified_name,		const std::string &scope_qualified_name,
std::vector<lldb_private::ConstString> &mangled_names) override;		std::vector<lldb_private::ConstString> &mangled_names) override;

		void AddSymbols(lldb_private::Symtab &symtab) override;

uint32_t		uint32_t
FindTypes(const lldb_private::SymbolContext &sc,		FindTypes(const lldb_private::SymbolContext &sc,
const lldb_private::ConstString &name,		const lldb_private::ConstString &name,
const lldb_private::CompilerDeclContext *parent_decl_ctx,		const lldb_private::CompilerDeclContext *parent_decl_ctx,
bool append, uint32_t max_matches,		bool append, uint32_t max_matches,
llvm::DenseSet<lldb_private::SymbolFile *> &searched_symbol_files,		llvm::DenseSet<lldb_private::SymbolFile *> &searched_symbol_files,
lldb_private::TypeMap &types) override;		lldb_private::TypeMap &types) override;

[...]	private:

void CacheFunctionNames();		void CacheFunctionNames();

bool DeclContextMatchesThisSymbolFile(		bool DeclContextMatchesThisSymbolFile(
const lldb_private::CompilerDeclContext *decl_ctx);		const lldb_private::CompilerDeclContext *decl_ctx);

uint32_t GetCompilandId(const llvm::pdb::PDBSymbolData &data);		uint32_t GetCompilandId(const llvm::pdb::PDBSymbolData &data);

		lldb_private::Symbol *
		GetPublicSymbol(const llvm::pdb::PDBSymbolPublicSymbol &pub_symbol);

llvm::DenseMap<uint32_t, lldb::CompUnitSP> m_comp_units;		llvm::DenseMap<uint32_t, lldb::CompUnitSP> m_comp_units;
llvm::DenseMap<uint32_t, lldb::TypeSP> m_types;		llvm::DenseMap<uint32_t, lldb::TypeSP> m_types;
llvm::DenseMap<uint32_t, lldb::VariableSP> m_variables;		llvm::DenseMap<uint32_t, lldb::VariableSP> m_variables;
		llvm::DenseMap<uint32_t, lldb::SymbolSP> m_public_symbols;
		clayborg [Done]: Do we need this mapping anymore? We should just add the symbols to the symbol table during `SymbolFilePDB::AddSymbols(...)`.
llvm::DenseMap<uint64_t, std::string> m_public_names;		llvm::DenseMap<uint64_t, std::string> m_public_names;

SecContribsMap m_sec_contribs;		SecContribsMap m_sec_contribs;

std::vector<lldb::TypeSP> m_builtin_types;		std::vector<lldb::TypeSP> m_builtin_types;
std::unique_ptr<llvm::pdb::IPDBSession> m_session_up;		std::unique_ptr<llvm::pdb::IPDBSession> m_session_up;
std::unique_ptr<llvm::pdb::PDBSymbolExe> m_global_scope_up;		std::unique_ptr<llvm::pdb::PDBSymbolExe> m_global_scope_up;
uint32_t m_cached_compile_unit_count;		uint32_t m_cached_compile_unit_count;
std::unique_ptr<lldb_private::CompilerDeclContext> m_tu_decl_ctx_up;		std::unique_ptr<lldb_private::CompilerDeclContext> m_tu_decl_ctx_up;

lldb_private::UniqueCStringMap<uint32_t> m_func_full_names;		lldb_private::UniqueCStringMap<uint32_t> m_func_full_names;
lldb_private::UniqueCStringMap<uint32_t> m_func_base_names;		lldb_private::UniqueCStringMap<uint32_t> m_func_base_names;
lldb_private::UniqueCStringMap<uint32_t> m_func_method_names;		lldb_private::UniqueCStringMap<uint32_t> m_func_method_names;
};		};

#endif // lldb_Plugins_SymbolFile_PDB_SymbolFilePDB_h_		#endif // lldb_Plugins_SymbolFile_PDB_SymbolFilePDB_h_

source/Plugins/SymbolFile/PDB/SymbolFilePDB.cpp

[...]	SymbolFilePDB::FindFunctions(const lldb_private::RegularExpression &regex,

return sc_list.GetSize() - old_size;		return sc_list.GetSize() - old_size;
}		}

void SymbolFilePDB::GetMangledNamesForFunction(		void SymbolFilePDB::GetMangledNamesForFunction(
const std::string &scope_qualified_name,		const std::string &scope_qualified_name,
std::vector<lldb_private::ConstString> &mangled_names) {}		std::vector<lldb_private::ConstString> &mangled_names) {}

		void SymbolFilePDB::AddSymbols(lldb_private::Symtab &symtab) {
		std::set<lldb::addr_t> sym_addresses;
		for (size_t i = 0; i < symtab.GetNumSymbols(); i++)
		sym_addresses.insert(symtab.SymbolAtIndex(i)->GetFileAddress());

		auto results = m_global_scope_up->findAllChildren<PDBSymbolPublicSymbol>();
		if (!results)
		return;

		while (auto pub_symbol = results->getNext()) {
		auto symbol_ptr = GetPublicSymbol(*pub_symbol);
		if (!symbol_ptr)
		continue;
		clayborg [Done]: Maybe get the file address from `pub_symbol` and avoid converting to a SymbolSP that we will never use? See more comments below.
		aleksandr.urakov (author) [Not Done]: Unfortunately, `PDBSymbolPublicSymbol` has no method that retrieves the file address directly. I'll calculate it as section + offset instead.

		auto file_addr = symbol_ptr->GetFileAddress();
		if (sym_addresses.find(file_addr) != sym_addresses.end())
		continue;
		sym_addresses.insert(file_addr);

		symtab.AddSymbol(*symbol_ptr);
		clayborg [Done]: Just make a local symbol and convert it from PDB to lldb_private::Symbol here? Something like `symtab.AddSymbol(ConvertPDBSymbol(pdb_symbol));`. So it seems we shouldn't need the m_public_symbols storage in this class anymore, since "symtab" will now own the symbol we just created.
		aleksandr.urakov (author) [Not Done]: The problem here is that `ConvertPDBSymbol` can fail, e.g. if `pub_symbol` somehow contains an invalid section number. We can:
		- change the interface of the function to signal errors;
		- just assume that the section number is always valid (because we already retrieved it earlier, during the file address calculation).
		In both cases we will retrieve the section twice. We could also:
		- just pass the already calculated section and offset to `ConvertPDBSymbol`.
		But that leads to blurred responsibility and a weird interface. So I think it would be better to just create the symbol right here; it doesn't seem to make the code more complex. What do you think?
		}

		symtab.CalculateSymbolSizes();
		}
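
The deduplication discussed in the inline comments above can be sketched in isolation. This is only an illustrative sketch, not the code from the patch: `PublicSymbol`, `Section`, `Symbol`, `FileAddressOf` and `AddPublicSymbols` are hypothetical stand-ins for the PDB and LLDB types, and C++17 is assumed for `std::optional`. It shows the file address computed as section base + offset, with an invalid section number reported as a failure instead of producing a symbol.

```cpp
#include <cstddef>
#include <cstdint>
#include <optional>
#include <set>
#include <string>
#include <vector>

// Hypothetical stand-ins for the PDB and LLDB types; not the real classes.
struct PublicSymbol {
  std::string name;
  uint32_t section; // 1-based section number; 0 means "no section"
  uint32_t offset;  // offset within that section
};

struct Section {
  uint64_t file_address; // file address of the start of the section
};

struct Symbol {
  std::string name;
  uint64_t file_address;
};

// Compute section base + offset. Returns std::nullopt when the section
// number does not refer to an existing section.
std::optional<uint64_t> FileAddressOf(const PublicSymbol &pub,
                                      const std::vector<Section> &sections) {
  if (pub.section == 0 || pub.section > sections.size())
    return std::nullopt;
  return sections[pub.section - 1].file_address + pub.offset;
}

// Append every public symbol whose file address is not yet present in
// symtab. Returns the number of symbols that were actually added.
size_t AddPublicSymbols(std::vector<Symbol> &symtab,
                        const std::vector<PublicSymbol> &publics,
                        const std::vector<Section> &sections) {
  std::set<uint64_t> seen;
  for (const Symbol &s : symtab)
    seen.insert(s.file_address);

  size_t added = 0;
  for (const PublicSymbol &pub : publics) {
    std::optional<uint64_t> addr = FileAddressOf(pub, sections);
    if (!addr)
      continue; // invalid section: skip instead of creating a symbol
    if (!seen.insert(*addr).second)
      continue; // address already covered by an existing symbol
    symtab.push_back(Symbol{pub.name, *addr});
    ++added;
  }
  return added;
}
```

Computing the address first means no symbol object is built for entries that end up being skipped, which is the concern raised in the review.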

uint32_t SymbolFilePDB::FindTypes(		uint32_t SymbolFilePDB::FindTypes(
const lldb_private::SymbolContext &sc,		const lldb_private::SymbolContext &sc,
const lldb_private::ConstString &name,		const lldb_private::ConstString &name,
const lldb_private::CompilerDeclContext *parent_decl_ctx, bool append,		const lldb_private::CompilerDeclContext *parent_decl_ctx, bool append,
uint32_t max_matches,		uint32_t max_matches,
llvm::DenseSet<lldb_private::SymbolFile *> &searched_symbol_files,		llvm::DenseSet<lldb_private::SymbolFile *> &searched_symbol_files,
lldb_private::TypeMap &types) {		lldb_private::TypeMap &types) {
if (!append)		if (!append)
types.Clear();		types.Clear();
if (!name)		if (!name)
return 0;		return 0;
if (!DeclContextMatchesThisSymbolFile(parent_decl_ctx))		if (!DeclContextMatchesThisSymbolFile(parent_decl_ctx))
return 0;		return 0;

searched_symbol_files.clear();		searched_symbol_files.clear();
searched_symbol_files.insert(this);		searched_symbol_files.insert(this);

std::string name_str = name.AsCString();		std::string name_str = name.AsCString();

// There is an assumption 'name' is not a regex		// There is an assumption 'name' is not a regex
FindTypesByName(name_str, parent_decl_ctx, max_matches, types);		FindTypesByName(name_str, parent_decl_ctx, max_matches, types);

return types.GetSize();		return types.GetSize();
}		}

		clayborg [Not Done]: Seems like these two lines should be done in the symbol vendor? Maybe this function should return the number of symbols added; the symbol vendor could then check whether AddSymbols returns a positive number and, if so, call symtab.CalculateSymbolSizes() and symtab.Finalize(). We should also see who else is calling these, remove those calls, and do it only once in the SymbolVendor.
void SymbolFilePDB::FindTypesByRegex(		void SymbolFilePDB::FindTypesByRegex(
const lldb_private::RegularExpression &regex, uint32_t max_matches,		const lldb_private::RegularExpression &regex, uint32_t max_matches,
lldb_private::TypeMap &types) {		lldb_private::TypeMap &types) {
// When searching by regex, we need to go out of our way to limit the search		// When searching by regex, we need to go out of our way to limit the search
// space as much as possible since this searches EVERYTHING in the PDB,		// space as much as possible since this searches EVERYTHING in the PDB,
// manually doing regex comparisons. PDB library isn't optimized for regex		// manually doing regex comparisons. PDB library isn't optimized for regex
// searches or searches across multiple symbol types at the same time, so the		// searches or searches across multiple symbol types at the same time, so the
// best we can do is to search enums, then typedefs, then classes one by one,		// best we can do is to search enums, then typedefs, then classes one by one,
[...]	while (auto LexParent = m_session_up->getSymbolById(LexParentId)) {
if (LexParent->getSymTag() == PDB_SymType::Compiland)		if (LexParent->getSymTag() == PDB_SymType::Compiland)
return LexParentId;		return LexParentId;
LexParentId = LexParent->getRawSymbol().getLexicalParentId();		LexParentId = LexParent->getRawSymbol().getLexicalParentId();
}		}
}		}

return 0;		return 0;
}		}

		lldb_private::Symbol *SymbolFilePDB::GetPublicSymbol(
		const llvm::pdb::PDBSymbolPublicSymbol &pub_symbol) {
		clayborg [Done]: Maybe convert this to `lldb_private::Symbol ConvertPDBSymbol(const llvm::pdb::PDBSymbolPublicSymbol &pub_symbol)`, and only call it when we need to create a symbol that we will add to the "symtab" in `SymbolFilePDB::AddSymbols(...)`.
		aleksandr.urakov (author) [Not Done]: See the comment above.
		auto it = m_public_symbols.find(pub_symbol.getSymIndexId());
		if (it != m_public_symbols.end())
		return it->second.get();

		auto section_list = m_obj_file->GetSectionList();
		if (!section_list)
		return nullptr;

		auto section_idx = pub_symbol.getAddressSection() - 1;
		if (section_idx >= section_list->GetSize())
		return nullptr;

		auto section = section_list->GetSectionAtIndex(section_idx);
		if (!section)
		return nullptr;

		auto size = pub_symbol.getLength();

		auto symbol_sp = std::make_shared<Symbol>(
		pub_symbol.getSymIndexId(), // symID
		pub_symbol.getName().c_str(), // name
		true, // name_is_mangled
		pub_symbol.isCode() ? eSymbolTypeCode : eSymbolTypeData, // type
		true, // external
		false, // is_debug
		false, // is_trampoline
		false, // is_artificial
		section, // section_sp
		pub_symbol.getAddressOffset(), // value
		size, // size
		size != 0, // size_is_valid
		false, // contains_linker_annotations
		0 // flags
		);

		m_public_symbols[pub_symbol.getSymIndexId()] = symbol_sp;

		return symbol_sp.get();
		}
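
One detail of the section lookup above is worth spelling out: the section number is 1-based, so it is decremented before indexing. Assuming the section number is an unsigned 32-bit value, a value of 0 wraps around to the largest `uint32_t` after the subtraction, and the following bounds check rejects it. The sketch below shows that behaviour on its own; `SectionIndexFor` is a hypothetical helper, not part of the patch or of LLDB.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical helper: convert a 1-based section number into a 0-based
// index. Returns false if the number does not name an existing section.
bool SectionIndexFor(uint32_t section_number, size_t num_sections,
                     size_t &index_out) {
  // Unsigned arithmetic: 0 - 1 wraps to UINT32_MAX rather than becoming -1.
  uint32_t idx = section_number - 1;
  if (idx >= num_sections)
    return false; // also catches the wrapped value produced by 0
  index_out = idx;
  return true;
}
```

A single upper-bound comparison therefore covers both "no section" (0) and out-of-range section numbers, as long as the real section count stays far below the wrapped value.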

source/Symbol/SymbolVendor.cpp

[...]	SymbolVendor *SymbolVendor::FindPlugin(const lldb::ModuleSP &module_sp,
return instance_ap.release();		return instance_ap.release();
}		}

//----------------------------------------------------------------------		//----------------------------------------------------------------------
// SymbolVendor constructor		// SymbolVendor constructor
//----------------------------------------------------------------------		//----------------------------------------------------------------------
SymbolVendor::SymbolVendor(const lldb::ModuleSP &module_sp)		SymbolVendor::SymbolVendor(const lldb::ModuleSP &module_sp)
: ModuleChild(module_sp), m_type_list(), m_compile_units(),		: ModuleChild(module_sp), m_type_list(), m_compile_units(),
m_sym_file_ap() {}		m_sym_file_ap(), m_symtab() {}

//----------------------------------------------------------------------		//----------------------------------------------------------------------
// Destructor		// Destructor
//----------------------------------------------------------------------		//----------------------------------------------------------------------
SymbolVendor::~SymbolVendor() {}		SymbolVendor::~SymbolVendor() {}

//----------------------------------------------------------------------		//----------------------------------------------------------------------
// Add a representation given an object file.		// Add a representation given an object file.
[...]	if (symfile_objfile)
return symfile_objfile->GetFileSpec();		return symfile_objfile->GetFileSpec();
}		}

return FileSpec();		return FileSpec();
}		}

Symtab *SymbolVendor::GetSymtab() {		Symtab *SymbolVendor::GetSymtab() {
ModuleSP module_sp(GetModule());		ModuleSP module_sp(GetModule());
if (module_sp) {		if (!module_sp)
		return nullptr;

		std::lock_guard<std::recursive_mutex> guard(module_sp->GetMutex());

		if (m_symtab)
		return m_symtab;

ObjectFile *objfile = module_sp->GetObjectFile();		ObjectFile *objfile = module_sp->GetObjectFile();
if (objfile) {		if (!objfile)
// Get symbol table from unified section list.
return objfile->GetSymtab();
}
}
return nullptr;		return nullptr;

		m_symtab = objfile->GetSymtab();
		if (m_symtab && m_sym_file_ap)
		m_sym_file_ap->AddSymbols(*m_symtab);

		return m_symtab;
}		}
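
The caching in `GetSymtab()` can be combined with the review suggestion that `AddSymbols` report how many symbols it added, so that the vendor finalizes the table once. The following is a minimal sketch under stated assumptions, not the LLDB implementation: all types are simplified stand-ins, `Finalize()` only counts its calls, and the vendor owns its own mutex, whereas the patch locks the module's mutex.

```cpp
#include <cstddef>
#include <mutex>
#include <vector>

// Hypothetical minimal symbol table; the counter only makes calls observable.
struct Symtab {
  std::vector<int> symbols;
  int finalize_calls = 0;
  void Finalize() { ++finalize_calls; }
};

struct ObjectFile {
  Symtab symtab;
  Symtab *GetSymtab() { return &symtab; }
};

struct SymbolFile {
  int add_calls = 0;
  // Returns the number of symbols added (the interface suggested in review).
  size_t AddSymbols(Symtab &symtab) {
    ++add_calls;
    symtab.symbols.push_back(42); // pretend one symbol came from the PDB
    return 1;
  }
};

class SymbolVendor {
public:
  SymbolVendor(ObjectFile *objfile, SymbolFile *symfile)
      : m_objfile(objfile), m_symfile(symfile) {}

  Symtab *GetSymtab() {
    std::lock_guard<std::recursive_mutex> guard(m_mutex);
    if (m_symtab)
      return m_symtab; // already merged; do not call AddSymbols again
    if (!m_objfile)
      return nullptr;
    m_symtab = m_objfile->GetSymtab();
    // Finalize only when the symbol file actually contributed something.
    if (m_symtab && m_symfile && m_symfile->AddSymbols(*m_symtab) > 0)
      m_symtab->Finalize();
    return m_symtab;
  }

private:
  std::recursive_mutex m_mutex;
  ObjectFile *m_objfile;
  SymbolFile *m_symfile;
  Symtab *m_symtab = nullptr; // cached after the first successful lookup
};
```

Because the cached pointer is checked under the lock, repeated or concurrent callers see the merged table without the symbol file being asked to add its symbols a second time.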

void SymbolVendor::ClearSymtab() {		void SymbolVendor::ClearSymtab() {
ModuleSP module_sp(GetModule());		ModuleSP module_sp(GetModule());
if (module_sp) {		if (module_sp) {
ObjectFile *objfile = module_sp->GetObjectFile();		ObjectFile *objfile = module_sp->GetObjectFile();
if (objfile) {		if (objfile) {
// Clear symbol table from unified section list.		// Clear symbol table from unified section list.
[...]

unittests/SymbolFile/PDB/SymbolFilePDBTests.cpp

[...]	TEST_F(SymbolFilePDBTests, TestNullName) {
SymbolContext sc;		SymbolContext sc;
llvm::DenseSet<SymbolFile *> searched_files;		llvm::DenseSet<SymbolFile *> searched_files;
TypeMap results;		TypeMap results;
uint32_t num_results = symfile->FindTypes(sc, ConstString(), nullptr, false,		uint32_t num_results = symfile->FindTypes(sc, ConstString(), nullptr, false,
0, searched_files, results);		0, searched_files, results);
EXPECT_EQ(0u, num_results);		EXPECT_EQ(0u, num_results);
EXPECT_EQ(0u, results.GetSize());		EXPECT_EQ(0u, results.GetSize());
}		}

		TEST_F(SymbolFilePDBTests, TestFindSymbolsWithNameAndType) {
		FileSpec fspec(m_pdb_test_exe.c_str(), false);
		ArchSpec aspec("i686-pc-windows");
		lldb::ModuleSP module = std::make_shared<Module>(fspec, aspec);

		SymbolContextList sc_list;
		EXPECT_EQ(1u,
		module->FindSymbolsWithNameAndType(ConstString("?foo@@YAHH@Z"),
		lldb::eSymbolTypeAny, sc_list));
		EXPECT_EQ(1u, sc_list.GetSize());

		SymbolContext sc;
		EXPECT_TRUE(sc_list.GetContextAtIndex(0, sc));
		EXPECT_STREQ("int foo(int)",
		sc.GetFunctionName(Mangled::ePreferDemangled).AsCString());
		}