This is an archive of the discontinued LLVM Phabricator instance.

Add PDBASTParser and parse type information from PDB
ClosedPublic

Authored by zturner on Apr 6 2016, 4:45 PM.

Details

Summary

This code is still untested aside from that it compiles. I mostly want to put an early work-in-progress up here to make sure I'm on the right path and am not doing anything fundamentally wrong. Not all types are supported, and I don't intend to support all types with this initial commit. Things like const / volatile are not important for a first pass, and I'm only looking at getting the most common basic types of types working.

One oddity of PDB is that the debug info does not maintain enough information to accurately reconstruct the DeclContext hierarchy. If you have this:

namespace Foo
{
    class Bar
    {
        class Baz
        {
        };
    };
}

then that will appear in the PDB as a type with the name Foo::Bar::Baz, and there's no information about whether Foo and Bar are namespaces or classes. It is possible to give a best effort attempt, but it's going to be outside the scope of this patch. So, for now, I intend to put every single type under the master translation unit decl with fully scoped names. This isn't perfect, but we can iterate on it in the future.

Again, this is a work in progress, mostly just want to make sure this looks like the right approach.

Diff Detail

Event Timeline

zturner updated this revision to Diff 52870.Apr 6 2016, 4:45 PM
zturner retitled this revision from to Add PDBASTParser and parse type information from PDB.
zturner updated this object.
zturner added a reviewer: clayborg.
zturner added a subscriber: lldb-commits.
clayborg requested changes to this revision.Apr 7 2016, 9:50 AM
clayborg edited edge metadata.

A few minor changes, but you are on the right track.

source/Plugins/SymbolFile/PDB/PDBASTParser.cpp
76–122

The byte sizes of your basic types need to come from getting the bit size of your basic types in your current clang::ASTContext which you can get from your m_ast:

clang::ASTContext *clang_ast = m_ast.getASTContext();

See the function ClangASTContext::GetBuiltinTypeForDWARFEncodingAndBitSize(...) for more hints.

As a hint this is how your float cases should look:

clang::ASTContext *ast = m_ast.getASTContext();
switch (type.getBuiltinType())
{
    case PDB_BuiltinType::Float:
        if (type.getLength()*8 == ast->getTypeSize(ast->FloatTy))
            return "float";
        if (type.getLength()*8 == ast->getTypeSize(ast->DoubleTy))
            return "double";
        if (type.getLength()*8 == ast->getTypeSize(ast->LongDoubleTy))
            return "long double";
        break;

Same goes for all other types.

155–157

If you are creating this type as a forward declaration and you want it to complete itself later, you will need to enable this with:

m_ast.SetHasExternalStorage (ClangUtil::GetQualType(clang_type)GetOpaqueQualType(), true);
184

If you are creating this type as a forward declaration and you want it to complete itself later, you will need to enable this with:

m_ast.SetHasExternalStorage (ClangUtil::GetQualType(ast_enum)GetOpaqueQualType(), true);

FYI: we don't leave enums as forward types, we just make them up right on the spot. So if you choose to let them complete themselves lazily, you might run into problems and need to fix some clang bugs.

source/Plugins/SymbolFile/PDB/SymbolFilePDB.cpp
411–412

This function is only needed for -gmodules so no need to worry about this working very well for your implementation unless PDB files have external file references where they say "I have some type named 'foo' that is in namespace 'bar' and I need to find it in another PDB file".

This revision now requires changes to proceed.Apr 7 2016, 9:50 AM
zturner added inline comments.Apr 7 2016, 11:37 AM
source/Plugins/SymbolFile/PDB/PDBASTParser.cpp
155–157

Does this mean I also need to pass eResolveStateForward to Type::Type()?

See inlined comments.

source/Plugins/SymbolFile/PDB/PDBASTParser.cpp
155–157

Yes

186

Make sure you complete your enum type, or set the resolve state to eResolveStateForward

One oddity of PDB is that the debug info does not maintain enough information to accurately reconstruct the DeclContext hierarchy. If you have this:

namespace Foo
{
    class Bar
    {
        class Baz
        {
        };
    };
}

then that will appear in the PDB as a type with the name Foo::Bar::Baz, and there's no information about whether Foo and Bar are namespaces or classes. It is possible to give a best effort attempt, but it's going to be outside the scope of this patch. So, for now, I intend to put every single type under the master translation unit decl with fully scoped names. This isn't perfect, but we can iterate on it in the future.

Again, this is a work in progress, mostly just want to make sure this looks like the right approach.

Couldn't you always just lookup "Foo::Bar" and see if you find a class? And then lookup "Foo" and find a namespace?

One oddity of PDB is that the debug info does not maintain enough information to accurately reconstruct the DeclContext hierarchy. If you have this:

namespace Foo
{
    class Bar
    {
        class Baz
        {
        };
    };
}

then that will appear in the PDB as a type with the name Foo::Bar::Baz, and there's no information about whether Foo and Bar are namespaces or classes. It is possible to give a best effort attempt, but it's going to be outside the scope of this patch. So, for now, I intend to put every single type under the master translation unit decl with fully scoped names. This isn't perfect, but we can iterate on it in the future.

Again, this is a work in progress, mostly just want to make sure this looks like the right approach.

Couldn't you always just lookup "Foo::Bar" and see if you find a class? And then lookup "Foo" and find a namespace?

If you lookup "Foo" you just wouldn't find anything, because namespaces are not stored anywhere. So you could then say "Ok I guess Foo is a namespace", but it's a little guess-y. Then you have to deal with classes defined in functions, make sure anonymous namespaces work, and things like that. In the end I can probably get pretty accurate, but it'll be a lot of extra work so I rather do it in small pieces.

Greg, what should happen when you look up a builtin type like "char" or "float"? Is there supposed to be a unique TypeSP for this somewhere, with a unique uid?

zturner added a comment.EditedApr 8 2016, 2:51 PM

I mean for example if someone calls FindTypes(..., "char", ...);

Should a TypeSP go into the map?

I mean for example if someone calls FindTypes(..., "char", ...);

Should a TypeSP go into the map?

No you don't have to make one up if you don't have one. At least this is what the DWARF parser does. The main problem is the definition for "char" can be signed or unsigned depending on how your translation unit was compiled. Probably some other char types can vary as well. No expressions will ask for built in types since the expression parser already knows about them. And it is useful to try and lookup "char" in a SymbolFile to see what the notion that that symbol file has for "char" (signed or unsigned).

So if you actually have one, return it. Or if PDB always has some around and it has some unique identifiers for the builtin types you can easily create them from your clang::ASTContext with the "ast->IntTy" and the many other built int types. But if you don't have any, don't worry about creating them from nothing.

Thanks. One more question. Does SymbolFile::FindTypes need to be able to
handle the case of a null type name (which I guess would indicate to find
every type)?

Thanks. One more question. Does SymbolFile::FindTypes need to be able to handle the case of a null type name (which I guess would indicate to find every type)?

No. It should handle the request without crashing and it should return zero matches, but you don't need to find every type. If we ever did, we would do this with regular expressions.

zturner updated this revision to Diff 53775.Apr 14 2016, 1:17 PM
zturner edited edge metadata.

Changes from first patch:

  1. Based on your other comment that it wasn't actually necessary to create a TypeSP for builtin types, I went ahead and removed all the code that was looking at bit and byte sizes. So all that stuff is now gone.
  1. Changed classes to resolve as forward declares, but I don't actually complete them yet. If it crashes if someone tries to complete it, I'm fine with that for now, I will probably work on completing them next anyway.
  1. Fully completing enums from the pdb file now.
  1. Added a bunch of unit tests.

There's no real way to write non-unit tests for this code yet, because that would require having a PDB and a matching executable and having the test generate it (like it does with dwarf, for example, by just compiling the executable with a Makefile). I could do this with MSVC, but our test Makefile system only works with clang, which won't know how to emit PDBs for a number of months.

In the future once clang knows how to generate pdbs I will update the test suite to run the entire suite with PDB as a 4th kind of debug format in addition to dsym, dwarf, and dwo.

zturner updated this revision to Diff 53780.Apr 14 2016, 1:47 PM
zturner edited edge metadata.

Generated the wrong patch last time. This one should be good

So you will need to make up base types if you have any types that use them. Like if a struct contains an "int", you will need to make an integer up and feed it to your CXXRecordDecl. So you will probably need to add those functions back. You will need them for:

  • class/struct/union members
  • class/struct method arguments and return values
  • arrays
  • plain variables whose types are basic types

That should be fine, but as long as I don't have to create a TypeSP for it when someone calls FindTypes, which is the only thing I'm implementing here, I think I should be ok. When I go to create CXXRecordDecls to complete class types, or when I add support for symbols, I can add those back because then I will need it. Right now I'm only trying to get type lookup working.

clayborg accepted this revision.Apr 14 2016, 2:47 PM
clayborg edited edge metadata.

Sounds good. Lets start with this and add on as needed.

This revision is now accepted and ready to land.Apr 14 2016, 2:47 PM
vsk added a subscriber: vsk.Apr 14 2016, 6:26 PM

Hi @zturner, this seems to have upset an lldb bot. Could you take a look (https://llvm.org/bugs/show_bug.cgi?id=27362)?

A new file was added, Greg or someone else at Apple may need to add it to
the Xcode workspace

Done in r266407. It built for me after this.

Jim

Shouldn't the SymbolFilePDB be only enabled in windows since it is useless on any other system since you are using the Windows system DLL to parse the info?

I wrote a library in llvm which is independent of the Windows system dll
and is abstracted such that we can plug in a non Windows specific version
of a pdb parser once we understand the format well enough.

For now, if you're not on Windows this library will just return an error if
you try to create one, which results in SymbolFilePDB returning 0
capabilities.

In the future though, all we have to do is add a non Windows implementation
of the parser in llvm and everything will just magically work on any
platform.

There's an active effort into understanding the format because clang needs
to know how to generate pdb when cross compiling for Windows on linux, so
there should be something concrete here in the future

I think this one is breaking make check-all for me because it cannot find and copy test-pdb-types.exe. Maybe this has to be checked in? test-dwarf.exe and test-pdb.exe are in SVN as well...

I was under the impression the make build was officially unsupported/dead?

labath added a subscriber: labath.Apr 20 2016, 8:08 AM

I was under the impression the make build was officially unsupported/dead?

You can still cmake to generate makefiles (and it does that by default).

In any case, this problem should not be related to make, unit tests fail even when using ninja:

Error copying file "llvm/tools/lldb/unittests/SymbolFile/PDB/Inputs/test-pdb-types.exe" to "build/dbg/tools/lldb/unittests/SymbolFile/PDB/Inputs".

Hmm, ok i will check it out later. I'm not sure why it's even trying to
copy an exe, should be only doing the pdb. Exe isn't even checked in

I was under the impression the make build was officially unsupported/dead?

You can still cmake to generate makefiles (and it does that by default).

In any case, this problem should not be related to make, unit tests fail even when using ninja:

Error copying file "llvm/tools/lldb/unittests/SymbolFile/PDB/Inputs/test-pdb-types.exe" to "build/dbg/tools/lldb/unittests/SymbolFile/PDB/Inputs".

Yes, that's what I'm seeing here as well. Sorry for not posting the error message

Hmm, ok i will check it out later. I'm not sure why it's even trying to
copy an exe, should be only doing the pdb. Exe isn't even checked in

unittests/SymbolFile/PDB/CMakeLists.txt
9

Maybe because it is specified here and not compiled?

It should be fixed now, please sync and try again

It should be fixed now, please sync and try again

Yes, thanks.

Eugene.Zelenko added a subscriber: Eugene.Zelenko.

Committed in r266392.