This is an archive of the discontinued LLVM Phabricator instance.

[NativePDB] Add basic support of methods recostruction in AST
ClosedPublic

Authored by aleksandr.urakov on Dec 28 2018, 4:23 AM.

Details

Summary

This patch adds the basic support of methods reconstruction by native PDB plugin. It contains only most obvious changes (it processes LF_ONEMETHOD and LF_METHOD records), some things still remain unsolved:

  • mangled names retrieving;
  • support of template methods (they are not presented as LF_ONEMETHOD records at all).

This info is contained in the Symbols stream, not in the TPI stream. As far as I understand, we can't find in a simple way the link between a LF_ONEMETHOD record and a corresponding S_GPROC32 record. I think it's the place, where we need to use the approach similar to D54053 (to parse mangled names and to build some part of AST based on this info). That's what I'm planning to implement next. Do you have any objections on this? Or may be there's a better way to solve this?

Diff Detail

Repository
rL LLVM

Event Timeline

This info is contained in the Symbols stream, not in the TPI stream. As far as I understand, we can't find in a simple way the link between a LF_ONEMETHOD record and a corresponding S_GPROC32 record. I think it's the place, where we need to use the approach similar to D54053 (to parse mangled names and to build some part of AST based on this info). That's what I'm planning to implement next. Do you have any objections on this? Or may be there's a better way to solve this?

I think it's probably best to skip this part for now and come back to it later. The only thing that will be missing is the ability to use member function templates in expressions. Of course we need this eventually, but probably there is more useful stuff to work on first.

One idea for implementing this though might be to add a pre-processing step of the publics stream similar to how we pre-process the TPI stream. This would also allow us to find mangled names of global functions as well (currently even for global functions, we create an empty Mangled structure when LLDB asks us for the mangled name). One pass over the publics stream should be simpler and more straightforward than a pass over every single module's symbol stream. From there we have the address, which tells us the module, and then we can use the CompilandIndexItem::FindSymbolsByVa() to find the S_GPROC32 record, and from there find the type. But I think this is a large effort for low value, so we should maybe wait until the easier stuff is done first. I haven't looked at the rest of the patch yet, but I will in a little bit. Thanks!

aleksandr.urakov marked an inline comment as done.Dec 29 2018, 1:12 AM

I think it's probably best to skip this part for now and come back to it later. The only thing that will be missing is the ability to use member function templates in expressions. Of course we need this eventually, but probably there is more useful stuff to work on first.

Yes, I agree. So I can continue with virtual bases support (but only since the middle of January), good?

One idea for implementing this though might be to add a pre-processing step of the publics stream similar to how we pre-process the TPI stream. This would also allow us to find mangled names of global functions as well (currently even for global functions, we create an empty Mangled structure when LLDB asks us for the mangled name). One pass over the publics stream should be simpler and more straightforward than a pass over every single module's symbol stream. From there we have the address, which tells us the module, and then we can use the CompilandIndexItem::FindSymbolsByVa() to find the S_GPROC32 record, and from there find the type. But I think this is a large effort for low value, so we should maybe wait until the easier stuff is done first. I haven't looked at the rest of the patch yet, but I will in a little bit. Thanks!

Thanks for comment! I think that such preprocessing is a good idea. But is it guaranteed that we will have a record in the publics stream for any function?

lit/SymbolFile/NativePDB/ast-methods.cpp
33 ↗(On Diff #179619)

The one thing I've forgot to say about is that parameter names are in Symbols stream too. Is it ok for now to proceed without them too?

zturner added inline comments.Dec 29 2018, 2:03 PM
lit/SymbolFile/NativePDB/ast-methods.cpp
28 ↗(On Diff #179619)

I think this is wrong. If I compile this file myself and look at the debug info, the debug info says it has cdecl calling convention.

D:\src\llvmbuild\ninja-x64>bin\llvm-pdbutil.exe dump -types foo.pdb | grep -C 5 simple_method
   0x106D | LF_METHODLIST [size = 20]
            - Method [type = 0x106B, vftable offset = -1, attrs = public compiler-generated]
            - Method [type = 0x106C, vftable offset = -1, attrs = public compiler-generated]
   0x106E | LF_FIELDLIST [size = 152]
            - LF_VFUNCTAB type = 0x1056
            - LF_ONEMETHOD [name = `simple_method`]
              type = 0x1059, vftable offset = -1, attrs = public
            - LF_ONEMETHOD [name = `virtual_method`]
              type = 0x1059, vftable offset = 0, attrs = public intro virtual
            - LF_ONEMETHOD [name = `static_method`]
              type = 0x105A, vftable offset = -1, attrs = public static

D:\src\llvmbuild\ninja-x64>bin\llvm-pdbutil.exe dump -types -type-index=0x1059 foo.pdb


                     Types (TPI Stream)
============================================================
  Showing 1 records.
   0x1059 | LF_MFUNCTION [size = 28]
            return type = 0x0003 (void), # args = 0, param list = 0x1011
            class type = 0x1057, this type = 0x1058, this adjust = 0
            calling conv = cdecl, options = None

Also, if I use clang-cl and pass the -ast-dump option, it will look like this:

| |-CXXMethodDecl 0x23869a32c38 <line:2:3, col:25> col:8 simple_method 'void ()'
| | `-CompoundStmt 0x23869a336f0 <col:24, col:25>

So it doesn't have __attribute__((thiscall)) as you do in the test. Could you double check this?

33 ↗(On Diff #179619)

Parameter names should only be important once you are debugging inside of a function, and at that point we will have already located the S_GPROC32 naturally. So I think it's ok for us to leave the parameter names blank, and update the AST lazily whenever we encounter an S_GPROC32 or S_LPROC32 naturally. We actually already parse the function parameter names in this method (See PdbAstBuilder::GetOrCreateFunctionDecl), so in fact I think everything will already mostly work, with only some minor code changes needed to support both global functions as well as member functions.

source/Plugins/SymbolFile/NativePDB/PdbAstBuilder.cpp
669–672 ↗(On Diff #179619)

I think there is a problem here.

aleksandr.urakov marked an inline comment as done.Dec 29 2018, 9:11 PM
aleksandr.urakov added inline comments.
lit/SymbolFile/NativePDB/ast-methods.cpp
28 ↗(On Diff #179619)

Hm... I tested it on x86, and I had there a "thiscall" attribute. It seems that on x64 it is not so. I think it's because the struct contains no fields and the methods doesn't use them - they don't need a "this" pointer. But I can't understand, why is there still "thiscall" on x86. Unfortunately, I can check it only two weeks later, but thanks for catching that!

source/Plugins/SymbolFile/NativePDB/PdbAstBuilder.cpp
669–672 ↗(On Diff #179619)

I can't understand, why?..

This info is contained in the Symbols stream, not in the TPI stream. As far as I understand, we can't find in a simple way the link between a LF_ONEMETHOD record and a corresponding S_GPROC32 record. I think it's the place, where we need to use the approach similar to D54053 (to parse mangled names and to build some part of AST based on this info). That's what I'm planning to implement next. Do you have any objections on this? Or may be there's a better way to solve this?

I think it's probably best to skip this part for now and come back to it later. The only thing that will be missing is the ability to use member function templates in expressions. Of course we need this eventually, but probably there is more useful stuff to work on first.

You might be interested to know that member function templates don't currently work in dwarf land either. I don't understand the full background, but the issue has something to do with the fact that they (and their debug info in dwarf) are emitted only when used, which means that the class declaration can end up being different depending on where you reconstruct it from. This leads to problems if two class definitions (e.g., from two different modules) end up being used in the same expression. There's a longer discussion about the problem here: http://lists.llvm.org/pipermail/llvm-dev/2018-June/124067.html

@zturner Sorry for the long delay with this update. I've applied the changes we were talking about.

@labath Thanks for sharing this, it's interesting to know!

Ping! Can you take a look, please?

zturner accepted this revision.Jan 28 2019, 10:22 AM
This revision is now accepted and ready to land.Jan 28 2019, 10:22 AM

Thanks for the reminder!

This revision was automatically updated to reflect the committed changes.