Page MenuHomePhabricator

Support template instantiation in the expression evaluator
Needs ReviewPublic

Authored by jarin on Oct 22 2019, 8:49 AM.

Details

Summary

WORK IN PROGRESS

This change is mostly for discussion, it is messy and likely wrong as I am new to the lldb codebase.

The goal of this exercise is to support expressions containing template instantiations, e.g. foo<int>::x.

The idea here is to collect all template types in the dwarf index. Later, when asked to find a name X, we import all the instantiations of X and pass the template declaration with the instantiations to Clang. In addition to my poor abstraction skills, the major problem with this is that we need to import all the instantiations even when only one is used. I tried to only import forward declarations, but if the instantiations were already previously completed, the AST importer still imports everything. This leads to massive evaluation times (tens of seconds) on templates with many instantiations (std::vector and the likes).

I am planning to pursue a different avenue, where I would introduce another hook into clang (perhaps in Sema::CheckTemplateIdType), but I am not sure how viable that path is.

Diff Detail

Event Timeline

jarin created this revision.Oct 22 2019, 8:49 AM
Herald added a project: Restricted Project. · View Herald TranscriptOct 22 2019, 8:49 AM

Will properly review this later, but this seems like a patch that might be related to the performance issue here: https://reviews.llvm.org/D41416

jarin updated this revision to Diff 230069.Nov 19 2019, 7:25 AM

This update introduces a callback from clang for template specialization. The callback allows lldb to construct instantiations on demand, rather than having to create the instantiation eagerly.

Perhaps it would still beneficial to prune the instantiations (we only need one "sample" instantiation per template) when querying the generic type from the symbol file so that we do not have to parse all the instantiations eagerly.

friss added a subscriber: friss.Nov 19 2019, 4:32 PM

Sorry that I haven't reviewed the patch, but there's something I'd like to point out before anyone invests a lot of time into plugin holes in our current template support code.

It would be great to fix the way templates are represented, because currently the debug info doesn't allow us to answer Clang's requests correctly. There is the beginning of a discussion here: http://lists.llvm.org/pipermail/lldb-commits/Week-of-Mon-20180507/040689.html

Basically, today the debug info will describe an entity named "Foo<int>". The accelerator tables all reference this name. So when Clang asks us if we know "Foo" (which is what happens when instantiating), we fail to find the right instantiations. The consensus of the above discussion was that we should change the debug info to have "Foo" as the name of any instantiation, with a child DIE describing the template arguments. Just doing this in the compiler causes test failures in LLDB, so there's some work to do in LLDB to support this.

Sorry that I haven't reviewed the patch, but there's something I'd like to point out before anyone invests a lot of time into plugin holes in our current template support code.

It would be great to fix the way templates are represented, because currently the debug info doesn't allow us to answer Clang's requests correctly. There is the beginning of a discussion here: http://lists.llvm.org/pipermail/lldb-commits/Week-of-Mon-20180507/040689.html

Basically, today the debug info will describe an entity named "Foo<int>". The accelerator tables all reference this name. So when Clang asks us if we know "Foo" (which is what happens when instantiating), we fail to find the right instantiations. The consensus of the above discussion was that we should change the debug info to have "Foo" as the name of any instantiation, with a child DIE describing the template arguments. Just doing this in the compiler causes test failures in LLDB, so there's some work to do in LLDB to support this.

Having an entity for the template itself would be great. However, that would require compiler changes, so only the code compiled with new compilers would benefit, no? I am afraid we need a story for older toolchains, too.

In this patch, I do it in a really hacky way that does not work with accelerator tables, it only works with ManualDWARFIndex by extracting template names from instantiated templates.

I think this approach looks already much more feasible than the last iteration.

+Richard because he knows better how FindClassTemplateSpecialization should look like to fit into the other lookup mechanisms we have in the ExternalASTSource (and I guess there is some more work needed to not violate some AST invariants and do this lookup in the right places). And +Vassil because he is probably also interested in having something like this for Modules at some point.

Sorry that I haven't reviewed the patch, but there's something I'd like to point out before anyone invests a lot of time into plugin holes in our current template support code.

It would be great to fix the way templates are represented, because currently the debug info doesn't allow us to answer Clang's requests correctly. There is the beginning of a discussion here: http://lists.llvm.org/pipermail/lldb-commits/Week-of-Mon-20180507/040689.html

Basically, today the debug info will describe an entity named "Foo<int>". The accelerator tables all reference this name. So when Clang asks us if we know "Foo" (which is what happens when instantiating), we fail to find the right instantiations. The consensus of the above discussion was that we should change the debug info to have "Foo" as the name of any instantiation, with a child DIE describing the template arguments. Just doing this in the compiler causes test failures in LLDB, so there's some work to do in LLDB to support this.

Having an entity for the template itself would be great. However, that would require compiler changes, so only the code compiled with new compilers would benefit, no? I am afraid we need a story for older toolchains, too.

Well, I think that's a question of tradeoffs -- i.e., how many hoops you would need to jump through to support the old compiler. Since the "manual" index needs to scan the debug info anyway, I think that the cost of supporting this flow (both in terms of runtime and code complexity) would be sufficiently low to enable that. However, for the apple/dwarf5 indexes the cost (both kinds) might be prohibitively high...

I don't think that supporting this only in the "manual" case would be that bad, but would still probably want to consider the problem as a whole, and come up with an accelerator story too, so that they can eventually catch up in terms of functionality...

friss added a comment.Nov 21 2019, 8:02 AM

Sorry that I haven't reviewed the patch, but there's something I'd like to point out before anyone invests a lot of time into plugin holes in our current template support code.

It would be great to fix the way templates are represented, because currently the debug info doesn't allow us to answer Clang's requests correctly. There is the beginning of a discussion here: http://lists.llvm.org/pipermail/lldb-commits/Week-of-Mon-20180507/040689.html

Basically, today the debug info will describe an entity named "Foo<int>". The accelerator tables all reference this name. So when Clang asks us if we know "Foo" (which is what happens when instantiating), we fail to find the right instantiations. The consensus of the above discussion was that we should change the debug info to have "Foo" as the name of any instantiation, with a child DIE describing the template arguments. Just doing this in the compiler causes test failures in LLDB, so there's some work to do in LLDB to support this.

Having an entity for the template itself would be great. However, that would require compiler changes, so only the code compiled with new compilers would benefit, no? I am afraid we need a story for older toolchains, too.

Finding a way to handle older toolchains would be great, but not essential IMO. I would hope that when most people upgrade their LLDB, they also get a Clang update. We need to be careful not to break what is already working with older compilers though.

I wanted to make sure that people understand about how templates are done in DWARF and the implications, just for completeness:

  1. DWARF represents only specializations of templates (foo<int>) and not the generic template definition in terms of type T (foo<T>)
  2. DWARF only creates the methods that are used in each compile unit. So we might only have std::vector<int>::erase() and std::vector<int>::back() in one compile unit and std::vector<int>::operator[](size_t) in another
  3. The template type definitions we create in the clang AST context therefore have no generic template methods. The definition for foo<T> has no methods. All methods for the specialized type (foo<int>) are treated as specializations specific to the <int> type. We have to do this since we item #2 above might only have a fraction of the real template definition's methods.
  4. #3 means that we must iterate over all instances of foo<int> to find as many methods as possible when creating the type since each one might only have a fraction of all methods from the original foo<T> definition in the code.

Any fixes we check in should:

  • make sure that all tests that were previously passing continue to pass if we do any modifications.
  • work for all cases (manual indexing, accelerator tables from apple and DWARF)

https://reviews.llvm.org/D41416 could be relevant. I am not an expert but I think when reading the DWARF you could only register 'lazy' specializations which will be imported only when really required.

https://reviews.llvm.org/D41416 could be relevant. I am not an expert but I think when reading the DWARF you could only register 'lazy' specializations which will be imported only when really required.

https://reviews.llvm.org/D41416 is indeed relevant, but it has not been landed. I suffer from the same problem you do - lldb needs all the specializations, but providing those takes too much time. Are you still planning to land your granular lazy specialization change?

Basically, today the debug info will describe an entity named "Foo<int>". The accelerator tables all reference this name. So when Clang asks us if we know "Foo" (which is what happens when instantiating), we fail to find the right instantiations. The consensus of the above discussion was that we should change the debug info to have "Foo" as the name of any instantiation, with a child DIE describing the template arguments. Just doing this in the compiler causes test failures in LLDB, so there's some work to do in LLDB to support this.

Frederic, you say that "doing this in the compiler causes test failures in LLDB", which implies you have tried adding the template in the compiler. Do you have that compiler patch lying around so that we could have a look at what can be done on the lldb side?

I agree that a good long term fix is to have "Foo" as an entity in DWARF, although for backwards compatibility it might be better if the "Foo" template just contained references to the instantiations rather than having them as children.

Basically, today the debug info will describe an entity named "Foo<int>". The accelerator tables all reference this name. So when Clang asks us if we know "Foo" (which is what happens when instantiating), we fail to find the right instantiations. The consensus of the above discussion was that we should change the debug info to have "Foo" as the name of any instantiation, with a child DIE describing the template arguments. Just doing this in the compiler causes test failures in LLDB, so there's some work to do in LLDB to support this.

Frederic, you say that "doing this in the compiler causes test failures in LLDB", which implies you have tried adding the template in the compiler. Do you have that compiler patch lying around so that we could have a look at what can be done on the lldb side?

I agree that a good long term fix is to have "Foo" as an entity in DWARF, although for backwards compatibility it might be better if the "Foo" template just contained references to the instantiations rather than having them as children.

I am afraid you're overestimating the scope of that idea. I *think* that Fred was referring to simply changing the string that gets put into the DW_AT_name field of the /instantation/ (and, by extension, the accelerator table). The debug info would still describe instantiations only.

I don't believe anyone here was proposing to have DWARF actually describe templates themselves -- that might be possible, but you'd have to get pretty creative with dwarf attributes (and convince a bunch of people that this is actually a good idea).

jarin added a comment.Dec 17 2019, 1:43 AM

Basically, today the debug info will describe an entity named "Foo<int>". The accelerator tables all reference this name. So when Clang asks us if we know "Foo" (which is what happens when instantiating), we fail to find the right instantiations. The consensus of the above discussion was that we should change the debug info to have "Foo" as the name of any instantiation, with a child DIE describing the template arguments. Just doing this in the compiler causes test failures in LLDB, so there's some work to do in LLDB to support this.

Frederic, you say that "doing this in the compiler causes test failures in LLDB", which implies you have tried adding the template in the compiler. Do you have that compiler patch lying around so that we could have a look at what can be done on the lldb side?

I agree that a good long term fix is to have "Foo" as an entity in DWARF, although for backwards compatibility it might be better if the "Foo" template just contained references to the instantiations rather than having them as children.

I am afraid you're overestimating the scope of that idea. I *think* that Fred was referring to simply changing the string that gets put into the DW_AT_name field of the /instantation/ (and, by extension, the accelerator table). The debug info would still describe instantiations only.

I don't believe anyone here was proposing to have DWARF actually describe templates themselves -- that might be possible, but you'd have to get pretty creative with dwarf attributes (and convince a bunch of people that this is actually a good idea).

You are right, I was indeed imagining something else. What Fred proposes makes a lot of sense to me, although the lookup of instantiations might be slow because there would be no accelerator for that (but we could build one).

What Fred proposes makes a lot of sense to me, although the lookup of instantiations might be slow because there would be no accelerator for that (but we could build one).

Do we ever lookup instantiations through this API? Seems pretty fragile, due to all the template brackets and all kinds of whitespace one could insert between them...

jarin added a comment.Dec 17 2019, 2:04 AM

What Fred proposes makes a lot of sense to me, although the lookup of instantiations might be slow because there would be no accelerator for that (but we could build one).

Do we ever lookup instantiations through this API? Seems pretty fragile, due to all the template brackets and all kinds of whitespace one could insert between them...

I did it in this patch and it worked, but I agree it is brittle. Other than that, I think it is not used.

friss added a comment.Dec 17 2019, 8:00 AM

Basically, today the debug info will describe an entity named "Foo<int>". The accelerator tables all reference this name. So when Clang asks us if we know "Foo" (which is what happens when instantiating), we fail to find the right instantiations. The consensus of the above discussion was that we should change the debug info to have "Foo" as the name of any instantiation, with a child DIE describing the template arguments. Just doing this in the compiler causes test failures in LLDB, so there's some work to do in LLDB to support this.

Frederic, you say that "doing this in the compiler causes test failures in LLDB", which implies you have tried adding the template in the compiler. Do you have that compiler patch lying around so that we could have a look at what can be done on the lldb side?

I agree that a good long term fix is to have "Foo" as an entity in DWARF, although for backwards compatibility it might be better if the "Foo" template just contained references to the instantiations rather than having them as children.

I am afraid you're overestimating the scope of that idea. I *think* that Fred was referring to simply changing the string that gets put into the DW_AT_name field of the /instantation/ (and, by extension, the accelerator table). The debug info would still describe instantiations only.

Just confirming that this is indeed what I meant. I don't have the patch handy anymore (it was extremely small, about 5 lines IIRC).

Basically, today the debug info will describe an entity named "Foo<int>". The accelerator tables all reference this name. So when Clang asks us if we know "Foo" (which is what happens when instantiating), we fail to find the right instantiations. The consensus of the above discussion was that we should change the debug info to have "Foo" as the name of any instantiation, with a child DIE describing the template arguments. Just doing this in the compiler causes test failures in LLDB, so there's some work to do in LLDB to support this.

Frederic, you say that "doing this in the compiler causes test failures in LLDB", which implies you have tried adding the template in the compiler. Do you have that compiler patch lying around so that we could have a look at what can be done on the lldb side?

I agree that a good long term fix is to have "Foo" as an entity in DWARF, although for backwards compatibility it might be better if the "Foo" template just contained references to the instantiations rather than having them as children.

I am afraid you're overestimating the scope of that idea. I *think* that Fred was referring to simply changing the string that gets put into the DW_AT_name field of the /instantation/ (and, by extension, the accelerator table). The debug info would still describe instantiations only.

Just confirming that this is indeed what I meant. I don't have the patch handy anymore (it was extremely small, about 5 lines IIRC).

FWIW, the Sony debugger throws away the <type> part of the DW_AT_name and reconstructs it from the template_parameter children. The presence of typedefs and/or enums can make the <type> text inconsistent, especially across enums. Our debugger reconstructs the type-parameters because it's more consistent that way.

jarin added a comment.Jan 23 2020, 2:03 AM

FWIW, the Sony debugger throws away the <type> part of the DW_AT_name and reconstructs it from the template_parameter children. The presence of typedefs and/or enums can make the <type> text inconsistent, especially across enums. Our debugger reconstructs the type-parameters because it's more consistent that way.

We have not seen any problems yet, but it does make sense to protect ourselves against different compilers. (I am not quite sure when I'll get back to this patch again as the current version has been good enough for our branch so far.)