This is an archive of the discontinued LLVM Phabricator instance.

Change SymbolFile::ParseTypes to ParseTypesForCompileUnit
ClosedPublic

Authored by zturner on Jan 8 2019, 4:25 PM.

Details

Summary

The function SymbolFile::ParseTypes previously accepted a SymbolContext. This makes it extremely difficult to implement faithfully, because you have to account for all possible combinations of members being set in the SymbolContext. On the other hand, no clients of this function actually care about implementing this function to this strict of a standard. AFAICT, there is actually only 1 client in the entire codebase, and it is the function ParseAllDebugSymbols, which is itself only called for testing purposes when dumping information. At this call-site, the only field it sets is the CompileUnit, meaning that an implementer of a SymbolFile need not worry about any examining or handling any other fields which might be set.

By restricting this API to accept exactly a CompileUnit& and nothing more, we can simplify the life of new SymbolFile plugin implementers by making it clear exactly what the necessary and sufficient set of functionality they need to implement is, while at the same time removing some dead code that tried to handle other types of SymbolContext fields that were never going to be set anyway.

Diff Detail

Repository
rL LLVM

Event Timeline

zturner created this revision.Jan 8 2019, 4:25 PM
labath added a comment.Jan 9 2019, 6:43 AM

Sounds good to me (not clicking accept to see if anyone else has an opinion here).

So this Patch is effectively NFC, since no caller (not even a test) was using the functionality you've removed. Seems like a nice refactor.

lldb/source/Core/Module.cpp
382 ↗(On Diff #180766)

I'd argue that this comment is now obsolete.

So I like the ability to specify any symbol context to parse all types for. If you pass in a symbol context with only a module filled in, we parse all types. If we pass in a symbol context with only the compile unit filled in (no function), we parse all types in the compile unit. If we pass in a symbol context with a function or block, we parse all types in that function or block. Currently it is only being used for compile units, but I don't think we need to change the API.

lldb/source/Plugins/SymbolFile/Symtab/SymbolFileSymtab.cpp
228–238 ↗(On Diff #180766)

All these functions that just return zero, error, fail, should have default implementations in the SymbolFile class IMHO. Keeps diffs down.

So there really aren't that many things:

  • module only filled out, parse all types
  • compile unit only filled out, parse all type from a compile unit
  • function filled out, parse all types in a function
  • block filled out, parse all types in block, or we can skip this one

But if everyone else thinks differently, I'll go with the majority. Just my initial thoughts.

So I like the ability to specify any symbol context to parse all types for. If you pass in a symbol context with only a module filled in, we parse all types. If we pass in a symbol context with only the compile unit filled in (no function), we parse all types in the compile unit. If we pass in a symbol context with a function or block, we parse all types in that function or block. Currently it is only being used for compile units, but I don't think we need to change the API.

I'm actually specifically trying to remove this flexibility in the API because I think it's harmful. In fact as we speak I'm writing up an RFC to send to the mailing list. As someone who just went through implementing a new SymbolFile plugin, it is very difficult to implement these functions faithfully for arbitrary SymbolContexts, and begs for untested code as well as dead code (as evidenced by this patch, there was a bunch of dead code for combinations of SymbolContext fields that no caller actually cared about). So nobody currently even needs this anyway. which means that specifying an arbitrary SymbolContext is a case of YAGNI. If someone comes along one day and needs the ability to parse types in a function or block, they can call ParseTypesForFunction followed by ParseTypesForBlock. This way it is simple and easy for the caller to understand how to use the function, and it is equally simple and easy to understand for the implementer to implement the function. Currently it is complicated for both. For example, does specifying a Module and a Function mean to parse all types for that module or that function? Or does it mean to parse all types for that module and that function.

I would like to remove the ability to specify arbitrary SymbolContext for most of the other SymboFile functions as well (although there are a few where we might still need it).

This revision was not accepted when it landed; it landed in state Needs Review.Jan 10 2019, 1:02 PM
This revision was automatically updated to reflect the committed changes.

But if everyone else thinks differently, I'll go with the majority. Just my initial thoughts.

IIRC, the original point of this ParseTypes function was to have a way for lldb developers to force lldb to parse all the types in some interesting context. The equivalent facility had proved useful in gdb (maint print symbols) to trigger obscure type parsing issues directly. We didn't add a command for it since most often you are debugging lldb with lldb so you can just call it by hand.

When ParseTypes(SymbolContext &sc) gets used in ParseAllDebugSymbols, the calling code was already doing the iteration over CU's to dump all the types in the module. The code to do that iteration is quite simple, so I don't see the need to fold it into the ParseTypes function, and when you want to do the CU iteration by hand while debugging, you can call ParseAllDebugSymbols.

I'm not sure how useful "parsing all the types in a function" is, as types aren't usually defined in functions and so forcing just the ones in the function to get realized NOW isn't all that helpful. This might be more useful if it parsed all the types referred to (i.e. any DW_AT_type), but I don't think that's what it does. It just handles the various DW_TAG_some_kind_of_type DIEs.

So I agree with Zachary that supporting all the possibilities in a SymbolContext seems overly complex to define and not all that helpful. ParseTypesForCompileUnit seems like the right level to do this. It might be nice to have a "Module::ParseAllDebugTypes" for calling by hand, since "Module::ParseAllDebugSymbols" does way more work than needed if you know you are just interested in type parsing. But that's a minor thing.