This is an archive of the discontinued LLVM Phabricator instance.

Clang: fix AST representation of expanded template arguments.
ClosedPublic

Authored by mizvekov on Jun 17 2022, 7:27 PM.

Details

Summary

Extend clang's SubstTemplateTypeParm to represent the pack substitution index.

Fixes PR56099.

Signed-off-by: Matheus Izvekov <mizvekov@gmail.com>

Diff Detail

Event Timeline

mizvekov created this revision.Jun 17 2022, 7:27 PM
Herald added a project: Restricted Project. · View Herald TranscriptJun 17 2022, 7:27 PM
mizvekov published this revision for review.Jun 17 2022, 7:58 PM
mizvekov added reviewers: rsmith, v.g.vassilev.
Herald added a project: Restricted Project. · View Herald TranscriptJun 17 2022, 7:58 PM
Herald added a subscriber: cfe-commits. · View Herald Transcript
mizvekov updated this revision to Diff 438117.Jun 18 2022, 5:56 AM
mizvekov edited the summary of this revision. (Show Details)

.

mizvekov updated this revision to Diff 438813.Jun 21 2022, 1:21 PM
martong removed a subscriber: martong.Jul 4 2022, 3:29 AM
mizvekov updated this revision to Diff 445368.Jul 17 2022, 6:30 PM
mizvekov edited the summary of this revision. (Show Details)
shafik added inline comments.Jul 18 2022, 6:55 PM
clang/lib/AST/ASTImporter.cpp
1534

I think we should have a test in ASTImporterTest.cpp to make sure we are importing the pack index correctly.

mizvekov updated this revision to Diff 445993.Jul 19 2022, 5:40 PM
mizvekov marked an inline comment as done.Jul 19 2022, 5:42 PM
mizvekov added inline comments.
clang/lib/AST/ASTImporter.cpp
1534

Thanks for the suggestion, implemented.
It turns out we didn't even have any tests for importing SubsTemplateTypeParmType in the first place.

mizvekov updated this revision to Diff 447988.Jul 27 2022, 3:33 AM
mizvekov marked an inline comment as done.
davrec accepted this revision.Aug 8 2022, 3:01 PM
davrec added a subscriber: davrec.

This corrects a genuine deficiency in the AST, and the patch LGTM. Can we knock this off Matheus' stack?

This revision is now accepted and ready to land.Aug 8 2022, 3:01 PM

This corrects a genuine deficiency in the AST, and the patch LGTM. Can we knock this off Matheus' stack?

Thanks!!!! :-)

shafik accepted this revision.Aug 8 2022, 8:30 PM

LGTM as well

This revision was landed with ongoing or failed builds.Aug 9 2022, 5:26 AM
This revision was automatically updated to reflect the committed changes.

We have a translation unit, on which we see an increase of compilation time and clang memory allocation from 11GB to 14GB. We are working on an isolated case.

We have a translation unit, on which we see an increase of compilation time and clang memory allocation from 11GB to 14GB. We are working on an isolated case.

Thanks for looking into this!

What I can imagine may be happening here is that if we instantiate a template with a very large argument pack consisting of mostly the same template arguments, we will create many more SubstTemplateTypeParmType nodes with this patch, as they won't unique so much anymore, because each will have a different pack index.

We have a translation unit, on which we see an increase of compilation time and clang memory allocation from 11GB to 14GB. We are working on an isolated case.

Thanks for looking into this!

What I can imagine may be happening here is that if we instantiate a template with a very large argument pack consisting of mostly the same template arguments, we will create many more SubstTemplateTypeParmType nodes with this patch, as they won't unique so much anymore, because each will have a different pack index.

If this is indeed a major problem, and/or if performance is an issue for other of the patches in the stack, maybe the solution is along the lines of what I raised in https://discourse.llvm.org/t/rfc-improving-diagnostics-with-template-specialization-resugaring/64294/5: introduce an option that determines how much sugar we construct in the AST, and modify ASTContext::get*Type(...) accordingly.

For now it could have two possible values, 'medium' and 'maximum', and getSubstTemplateTypeParmType(...) could be modified so it always uses PackIndex=0 whenever it is only set to 'medium'. And, for later patches, resugaring would only be enabled when it is set to 'maximum'. (And maybe later a 'minimum' option could be experimented with which disables all sugar except that needed to keep tests passing.)

Adding to the comment @joanahalili posted: this particular translation unit happens to have a large number of reverse dependencies and Clang's memory allocation has pushed it over the limits, which makes this issue quite serious. So far I can't say how widespread are large increases in clang heap allocation, which don't lead to exceeding memory limits used in the build setup and thus are less noticeable. However, I suppose that this should affect various clang-based tools like ciderd. @sammccall @kadircet, fyi.

joanahalili added a comment.EditedAug 23 2022, 7:50 AM

clang -fsyntax-only -std=c++17  -fproc-stat-report -Wno-deprecated-declarations  -fsized-deallocation -Werror -Wno-deprecated-declarations  -Wno-inconsistent-missing-override -Wno-null-conversion -Wno-ignored-attributes  -Wno-defaulted-function-deleted -xc++ reproduction.cpp

This is the reproducer we managed to create for the memory increase. As mentioned above we notice both a difference in memory and execution time.
Clang version before this patch

  • Memory: 8Gb
  • Time: 32507 ms

Clang version with this patch:

  • Memory: 10.5 Gb (2.5Gb increase)
  • Time 63812 (almost double the previous compilation time).

This is the reproducer we managed to create for the memory increase. As mentioned above we notice both a difference in memory and execution time.

Thanks.

I also added a print of the amount of SubstTemplateTypeParmType nodes that were uniqued:

Before:

SubstTemplateTypeParmType: 751088
clang.exe: output=a.exe, total=46687.500 ms, user=46031.250 ms, mem=7793868 Kb

After:

SubstTemplateTypeParmType: 41919586
clang.exe: output=a.exe, total=68828.125 ms, user=67968.750 ms, mem=10511992 Kb

This DR is so basic that I don't really see reasonable way to avoid this cost except by disabling it with a flag.

Though I wonder how useful a SubstTemplateTypeParmType is without the pack index. If that kind of heavy handed meta-programming is reasonable, a flag for disabling substitution sugar entirely does not seem unreasonable...

This is the reproducer we managed to create for the memory increase. As mentioned above we notice both a difference in memory and execution time.

Thanks.

I also added a print of the amount of SubstTemplateTypeParmType nodes that were uniqued:

Before:

SubstTemplateTypeParmType: 751088
clang.exe: output=a.exe, total=46687.500 ms, user=46031.250 ms, mem=7793868 Kb

After:

SubstTemplateTypeParmType: 41919586
clang.exe: output=a.exe, total=68828.125 ms, user=67968.750 ms, mem=10511992 Kb

This DR is so basic that I don't really see reasonable way to avoid this cost except by disabling it with a flag.

Though I wonder how useful a SubstTemplateTypeParmType is without the pack index. If that kind of heavy handed meta-programming is reasonable, a flag for disabling substitution sugar entirely does not seem unreasonable...

I wonder what is the practical application of the substitution index in SubstTemplateTypeParmType? Diagnostics? Matching AST? Something else? What would be the cost of calculating the index when necessary?

I wonder what is the practical application of the substitution index in SubstTemplateTypeParmType? Diagnostics? Matching AST? Something else? What would be the cost of calculating the index when necessary?

The pack index helps you figure out which argument within an argument pack was used in a substitution, much like the ReplacementType has an index which helps you figure out which argument was used.

Calculating it should be possible, if we have the whole type which contains the pattern from where the expansion came from.

The problem is that if we want to extract a sub-type from the whole, say the type of a certain function argument, then we would have to rebuild that type to include this pack index in any substitution sugar. And that sounds expensive as well.

I wonder what is the practical application of the substitution index in SubstTemplateTypeParmType? Diagnostics? Matching AST? Something else? What would be the cost of calculating the index when necessary?

The pack index helps you figure out which argument within an argument pack was used in a substitution, much like the ReplacementType has an index which helps you figure out which argument was used.

Calculating it should be possible, if we have the whole type which contains the pattern from where the expansion came from.

The problem is that if we want to extract a sub-type from the whole, say the type of a certain function argument, then we would have to rebuild that type to include this pack index in any substitution sugar. And that sounds expensive as well.

Do you have a practical example that would use the substitution index? I believe you had something in mind before you implemented this patch?

Do you have a practical example that would use the substitution index? I believe you had something in mind before you implemented this patch?

Oh, yes, this is needed for / is going to culminate in the implementation of the project described in https://discourse.llvm.org/t/rfc-improving-diagnostics-with-template-specialization-resugaring/64294/11

Do you have a practical example that would use the substitution index? I believe you had something in mind before you implemented this patch?

Oh, yes, this is needed for / is going to culminate in the implementation of the project described in https://discourse.llvm.org/t/rfc-improving-diagnostics-with-template-specialization-resugaring/64294/11

The whole project seems like a great improvement in clang diagnostics, but I don't yet see how template parameter pack substitution indices fit into the whole picture. If different type parameters in a parameter pack differ only by some other type of suraring, would they be distinguishable in the current state? How do substitution indices affect this?

The whole project seems like a great improvement in clang diagnostics, but I don't yet see how template parameter pack substitution indices fit into the whole picture. If different type parameters in a parameter pack differ only by some other type of suraring, would they be distinguishable in the current state? How do substitution indices affect this?

We need the pack index just as much as the index.

When we are resugaring some type and we find a SubstTemplateParmType, first we try to find a template specialization in the current naming context (ie a nested name specifier) which refers to the same template as the Subst node.

Then we have to rebuild that Subst node using, as the new underlying type, the template argument which appears in the template specialization we found. To find that argument, first we use the index, which already was available in the AST prior to this patch. But in case that index refers to an argument pack, then we need the pack index to be able to find the argument inside that.

The whole project seems like a great improvement in clang diagnostics, but I don't yet see how template parameter pack substitution indices fit into the whole picture. If different type parameters in a parameter pack differ only by some other type of suraring, would they be distinguishable in the current state? How do substitution indices affect this?

We need the pack index just as much as the index.

When we are resugaring some type and we find a SubstTemplateParmType, first we try to find a template specialization in the current naming context (ie a nested name specifier) which refers to the same template as the Subst node.

Then we have to rebuild that Subst node using, as the new underlying type, the template argument which appears in the template specialization we found. To find that argument, first we use the index, which already was available in the AST prior to this patch. But in case that index refers to an argument pack, then we need the pack index to be able to find the argument inside that.

I more or less understand the general direction, but it would be interesting to look at the specific code that uses this.

The main questions we need to answer here are:

  1. is the cost of storing parameter pack substitution indices while building AST worth the benefits?
  2. is there an alternative solution that would shift the cost to the moment when this information is about to be used (when formatting diagnostic, I suppose)?
  3. is there a good way to make this cost optional (using a flag or somehow else)?

The main questions we need to answer here are:

  1. is the cost of storing parameter pack substitution indices while building AST worth the benefits?
  2. is there an alternative solution that would shift the cost to the moment when this information is about to be used (when formatting diagnostic, I suppose)?
  3. is there a good way to make this cost optional (using a flag or somehow else)?
  1. For me yes, this was a fairly simple patch that got reviewed quickly and got the job done. But it seems it was not good for your use case :)
  2. I think there is a chance the approach I mentioned earlier could work. It's hard for me to tell without spending some time trying. This could though end up being a much more complicated patch, which hits my projects bottleneck: it's pretty hard to find reviewers for this sort of stuff.
  3. That has the advantage of change simplicity, but not user simplicity. It could provide a escape hatch in allowing us to have a simple solution now, but then iterate to something better later. But since this is an externally visible change to the AST, it does seem far from ideal to be changing this twice. It would be pretty unfortunate if someone else started using this feature and then we decided to take it away.

I would be fine though reverting this, and then doing some more investigation, and also perhaps waiting for a second opinion from @rsmith.

The main questions we need to answer here are:

  1. is the cost of storing parameter pack substitution indices while building AST worth the benefits?
  2. is there an alternative solution that would shift the cost to the moment when this information is about to be used (when formatting diagnostic, I suppose)?
  3. is there a good way to make this cost optional (using a flag or somehow else)?
  1. For me yes, this was a fairly simple patch that got reviewed quickly and got the job done. But it seems it was not good for your use case :)
  2. I think there is a chance the approach I mentioned earlier could work. It's hard for me to tell without spending some time trying. This could though end up being a much more complicated patch, which hits my projects bottleneck: it's pretty hard to find reviewers for this sort of stuff.
  3. That has the advantage of change simplicity, but not user simplicity. It could provide a escape hatch in allowing us to have a simple solution now, but then iterate to something better later. But since this is an externally visible change to the AST, it does seem far from ideal to be changing this twice. It would be pretty unfortunate if someone else started using this feature and then we decided to take it away.

I would be fine though reverting this, and then doing some more investigation, and also perhaps waiting for a second opinion from @rsmith.

For now we've added a workaround for the most problematic use case and got us unblocked. Thus there is no need now in introducing additional flags. However, I'm rather concerned about the cumulative additional overhead our build infrastructure gets as a result of this patch. Measuring this would unfortunately be a project of its own, which I'm not ready to commit to. If reverting this and discussing with Richard Smith is an option for you, I'd suggest doing this now and if you end up not finding a good alternative, we would probably need to find a way to estimate the impact of this patch. WDYT?

For now we've added a workaround for the most problematic use case and got us unblocked. Thus there is no need now in introducing additional flags. However, I'm rather concerned about the cumulative additional overhead our build infrastructure gets as a result of this patch. Measuring this would unfortunately be a project of its own, which I'm not ready to commit to. If reverting this and discussing with Richard Smith is an option for you, I'd suggest doing this now and if you end up not finding a good alternative, we would probably need to find a way to estimate the impact of this patch. WDYT?

Thanks, and sorry for the long time to respond to this.

I have been thinking about this problem, and I have an idea for a solution which is better on the performance side, but more complex change.

We would need to implement some pattern matching during AST traversal, and keep track of the pack index there. When extracting a component of the larger type, such as when deducing non-pack arguments from a pack, we would wrap it over with a new sugar type node which encodes the current pack indexes.

I reflected a bit on this, and yeah I think we should try to design something which does not potentially hinder large packs so much.

I am not sure how this is going to play out in practice, and it's possible we may need to come back to this solution again, but I think it's prudent to revert it for now if we are not sure, as this is an API change, however obscure for most users.

As an aside, it would be really helpful if you could track how the compilation of your codebase performs as clang evolves, regardless if we keep this change or not :)
Maybe you can reuse something from http://llvm-compile-time-tracker.com/

For now we've added a workaround for the most problematic use case and got us unblocked. Thus there is no need now in introducing additional flags. However, I'm rather concerned about the cumulative additional overhead our build infrastructure gets as a result of this patch. Measuring this would unfortunately be a project of its own, which I'm not ready to commit to. If reverting this and discussing with Richard Smith is an option for you, I'd suggest doing this now and if you end up not finding a good alternative, we would probably need to find a way to estimate the impact of this patch. WDYT?

Thanks, and sorry for the long time to respond to this.

I have been thinking about this problem, and I have an idea for a solution which is better on the performance side, but more complex change.

We would need to implement some pattern matching during AST traversal, and keep track of the pack index there. When extracting a component of the larger type, such as when deducing non-pack arguments from a pack, we would wrap it over with a new sugar type node which encodes the current pack indexes.

I reflected a bit on this, and yeah I think we should try to design something which does not potentially hinder large packs so much.

I am not sure how this is going to play out in practice, and it's possible we may need to come back to this solution again, but I think it's prudent to revert it for now if we are not sure, as this is an API change, however obscure for most users.

I appreciate your effort on designing an alternative solution. Hopefully, it works out well.

As an aside, it would be really helpful if you could track how the compilation of your codebase performs as clang evolves, regardless if we keep this change or not :)
Maybe you can reuse something from http://llvm-compile-time-tracker.com/

Good point about tracking compile time. We do have compiler performance tests running in a controlled environment with a number of measures in place to reduce errors in results, but they are limited to a rather small sample of build targets. Tracking compiler performance (times, memory used) for all of our code and using this as a regression test for compiler performance is impractical due to the size of the code base, rate of changes, and the design of the build system (high level of parallelism, aggressive caching, shared build cluster consisting of machines with different performance characteristics).

Two last thoughts:
1: To reiterate it doesn't seem right to view this (or any patch adding not-semantically-relevant info to the AST) as a one-size-fits-all costs vs. benefits optimization. On the one hand, the additional AST info hurts compilation performance. On the other, the info might make debugging faster, or be useful to a tool. A flag governing how much non-semantically-necessary info to add to the AST really seems the better solution in general, and warrants more discussion in cases like this (e.g. Matheus's other resugaring patches, which may cause this debate to resurface regardless of the approach he takes here). The user could set the flag high when debugging, and decrease it when diagnostics/debug info are less expected/not needed. (In light of the performance tests done here, the performance benefit of allowing the user to omit *other* non-semantically-necessary nodes from the AST could be significant; such a flag could allow that.)

2: After thinking about it further I don't think the pack index provides sufficiently useful info in any case, since packs will always be expanded in full, in order: when you find the first SubstTemplateTypeParmType expanded from a pack, the rest are sure to be right behind. IIUC I see how including the pack index makes resugaring more straightforward: the substitution of the sugared info for the non-sugared info occurs via a TreeTransform (see clang/lib/Sema/SemaTemplate.cpp in https://reviews.llvm.org/D127695), and by storing the pack index Matheus can simply override TransformSubstTemplateTypeParmType to make use of the pack index to easily fetch the corresponding sugared info. But since the pack indices are predictable given a bird's eye view of the AST, maybe state info can be stored in the TreeTransform to allow the pack index to be inferred in each call to TransformSubstTemplateTypeParmType?

2: After thinking about it further I don't think the pack index provides sufficiently useful info in any case, since packs will always be expanded in full, in order: when you find the first SubstTemplateTypeParmType expanded from a pack, the rest are sure to be right behind. IIUC I see how including the pack index makes resugaring more straightforward: the substitution of the sugared info for the non-sugared info occurs via a TreeTransform (see clang/lib/Sema/SemaTemplate.cpp in https://reviews.llvm.org/D127695), and by storing the pack index Matheus can simply override TransformSubstTemplateTypeParmType to make use of the pack index to easily fetch the corresponding sugared info. But since the pack indices are predictable given a bird's eye view of the AST, maybe state info can be stored in the TreeTransform to allow the pack index to be inferred in each call to TransformSubstTemplateTypeParmType?

Packs are expanded from patterns, which is an arbitrary type that will reference one or more parameter pack anywhere within. So going back from an expanded pack back to a pattern + number of expansions will take more effort and we may need to add new markings to the AST to help with that.

Check this example out: https://godbolt.org/z/rsGsM6GrM

template <class ...Ts> struct A {
  template <class ...Us> struct B {
      using type1 = void ((void (*...fps)(Ts, Us)));
  };
};
using type2 = A<int, char>::B<short, bool>::type1;
TypeAliasDecl 0x55ffe8b45368 <line:8:1, col:45> col:7 type2 'A<int, char>::B<short, bool>::type1':'void ((void (*)(int, short), void (*)(char, bool)))'
  `-ElaboratedType 0x55ffe8b452f0 'A<int, char>::B<short, bool>::type1' sugar
    `-TypedefType 0x55ffe8b452d0 'A<int, char>::B<short, bool>::type1' sugar
      |-TypeAlias 0x55ffe8b45258 'type1'
      `-FunctionProtoType 0x55ffe8b451e0 'void ((void (*)(int, short), void (*)(char, bool)))' cdecl
        |-ParenType 0x55ffe8b12150 'void' sugar
        | `-BuiltinType 0x55ffe8ac6370 'void'
        |-PointerType 0x55ffe8b44550 'void (*)(int, short)'
        | `-ParenType 0x55ffe8b444f0 'void (int, short)' sugar
        |   `-FunctionProtoType 0x55ffe8b444b0 'void (int, short)' cdecl
        |     |-BuiltinType 0x55ffe8ac6370 'void'
        |     |-SubstTemplateTypeParmType 0x55ffe8b44310 'int' sugar
        |     | |-TemplateTypeParmType 0x55ffe8b11440 'Ts' dependent contains_unexpanded_pack depth 0 index 0 pack
        |     | | `-TemplateTypeParm 0x55ffe8b113c0 'Ts'
        |     | `-BuiltinType 0x55ffe8ac6410 'int'
        |     `-SubstTemplateTypeParmType 0x55ffe8b443c0 'short' sugar
        |       |-TemplateTypeParmType 0x55ffe8b11960 'Us' dependent contains_unexpanded_pack depth 1 index 0 pack
        |       | `-TemplateTypeParm 0x55ffe8b118d8 'Us'
        |       `-BuiltinType 0x55ffe8ac63f0 'short'
        `-PointerType 0x55ffe8b450c0 'void (*)(char, bool)'
          `-ParenType 0x55ffe8b45060 'void (char, bool)' sugar
            `-FunctionProtoType 0x55ffe8b447d0 'void (char, bool)' cdecl
              |-BuiltinType 0x55ffe8ac6370 'void'
              |-SubstTemplateTypeParmType 0x55ffe8b44630 'char' sugar
              | |-TemplateTypeParmType 0x55ffe8b11440 'Ts' dependent contains_unexpanded_pack depth 0 index 0 pack
              | | `-TemplateTypeParm 0x55ffe8b113c0 'Ts'
              | `-BuiltinType 0x55ffe8ac63b0 'char'
              `-SubstTemplateTypeParmType 0x55ffe8b446e0 'bool' sugar
                |-TemplateTypeParmType 0x55ffe8b11960 'Us' dependent contains_unexpanded_pack depth 1 index 0 pack
                | `-TemplateTypeParm 0x55ffe8b118d8 'Us'
                `-BuiltinType 0x55ffe8ac6390 'bool'

And in fact, a given parameter pack might be referenced more than one time in a given pack expansion pattern:

template <class ...Ts> struct A {
  template <class ...Us> struct B {
      using type1 = void ((void (*...fps)(Ts, Ts, Us)));
  };
};

Ie those two Ts are referencing the same argument within the pack, we can't confuse that with an expansion.

So think also about a case like the above, but where you are performing the expansion just one time. It doesn't look like to me you can figure that out from what Clang leaves behind in the AST at the moment.
We may need to create a new AST node, which is like a Subst variant of a PackExpansionType, at the expansion loci.

davrec added a comment.EditedAug 27 2022, 9:38 AM

Great examples.

Check this example out: https://godbolt.org/z/rsGsM6GrM

template <class ...Ts> struct A {
  template <class ...Us> struct B {
      using type1 = void ((void (*...fps)(Ts, Us)));
  };
};
using type2 = A<int, char>::B<short, bool>::type1;
TypeAliasDecl 0x55ffe8b45368 <line:8:1, col:45> col:7 type2 'A<int, char>::B<short, bool>::type1':'void ((void (*)(int, short), void (*)(char, bool)))'
  `-ElaboratedType 0x55ffe8b452f0 'A<int, char>::B<short, bool>::type1' sugar
    `-TypedefType 0x55ffe8b452d0 'A<int, char>::B<short, bool>::type1' sugar
      |-TypeAlias 0x55ffe8b45258 'type1'
      `-FunctionProtoType 0x55ffe8b451e0 'void ((void (*)(int, short), void (*)(char, bool)))' cdecl
        |-ParenType 0x55ffe8b12150 'void' sugar
        | `-BuiltinType 0x55ffe8ac6370 'void'
        |-PointerType 0x55ffe8b44550 'void (*)(int, short)'
        | `-ParenType 0x55ffe8b444f0 'void (int, short)' sugar
        |   `-FunctionProtoType 0x55ffe8b444b0 'void (int, short)' cdecl
        |     |-BuiltinType 0x55ffe8ac6370 'void'
        |     |-SubstTemplateTypeParmType 0x55ffe8b44310 'int' sugar
        |     | |-TemplateTypeParmType 0x55ffe8b11440 'Ts' dependent contains_unexpanded_pack depth 0 index 0 pack
        |     | | `-TemplateTypeParm 0x55ffe8b113c0 'Ts'
        |     | `-BuiltinType 0x55ffe8ac6410 'int'
        |     `-SubstTemplateTypeParmType 0x55ffe8b443c0 'short' sugar
        |       |-TemplateTypeParmType 0x55ffe8b11960 'Us' dependent contains_unexpanded_pack depth 1 index 0 pack
        |       | `-TemplateTypeParm 0x55ffe8b118d8 'Us'
        |       `-BuiltinType 0x55ffe8ac63f0 'short'
        `-PointerType 0x55ffe8b450c0 'void (*)(char, bool)'
          `-ParenType 0x55ffe8b45060 'void (char, bool)' sugar
            `-FunctionProtoType 0x55ffe8b447d0 'void (char, bool)' cdecl
              |-BuiltinType 0x55ffe8ac6370 'void'
              |-SubstTemplateTypeParmType 0x55ffe8b44630 'char' sugar
              | |-TemplateTypeParmType 0x55ffe8b11440 'Ts' dependent contains_unexpanded_pack depth 0 index 0 pack
              | | `-TemplateTypeParm 0x55ffe8b113c0 'Ts'
              | `-BuiltinType 0x55ffe8ac63b0 'char'
              `-SubstTemplateTypeParmType 0x55ffe8b446e0 'bool' sugar
                |-TemplateTypeParmType 0x55ffe8b11960 'Us' dependent contains_unexpanded_pack depth 1 index 0 pack
                | `-TemplateTypeParm 0x55ffe8b118d8 'Us'
                `-BuiltinType 0x55ffe8ac6390 'bool'

If it were as simple as the above case the Resugarer TreeTransform could say keep a map of TemplateTypeParmTypes to current pack indices, and iterate those during each each TransformSubstTemplateTypeParmType call, but...

And in fact, a given parameter pack might be referenced more than one time in a given pack expansion pattern:

template <class ...Ts> struct A {
  template <class ...Us> struct B {
      using type1 = void ((void (*...fps)(Ts, Ts, Us)));
  };
};

Ie those two Ts are referencing the same argument within the pack, we can't confuse that with an expansion.

So think also about a case like the above, but where you are performing the expansion just one time. It doesn't look like to me you can figure that out from what Clang leaves behind in the AST at the moment.

...the different Ts are the same TemplateTypeParmType, so that wouldn't work. You're right, more info is needed.

We may need to create a new AST node, which is like a Subst variant of a PackExpansionType, at the expansion loci.

Or, maybe TemplateTypeParmType could store a number that would make it unique for each *instance* of it in a given expansion? Or just SubstTemplateTypeParmType could store this number in addition to its TemplateTypeParmType? (E.g. the first Ts in an expansion is 0, the second Ts in the same expansion is 1, etc. - but it resets for the next expansion.)

Or just SubstTemplateTypeParmType could store this number in addition to its TemplateTypeParmType? (E.g. the first Ts in an expansion is 0, the second Ts in the same expansion is 1, etc. - but it resets for the next expansion.)

Well that number is just the pack_index, as implemented in this current patch :)

I like the idea obviously, this patch is so simple, and simple solutions are the best when they are good enough.

I am not completely sold that this solution is not good enough, it's hard to make the case that the reproducer is a reasonable program, but at the same time it's obvious that packs where designed to deal very cheaply with very large number of arguments.

And this patch does not make the worst case worse: If all arguments are different types in all expansions, then a different pack_index will not cause an extra uniquing because the underlying types will be different as well anyway.

Or just SubstTemplateTypeParmType could store this number in addition to its TemplateTypeParmType? (E.g. the first Ts in an expansion is 0, the second Ts in the same expansion is 1, etc. - but it resets for the next expansion.)

Well that number is just the pack_index, as implemented in this current patch :)

I meant that, in an of expansion (Ts, Ts, Us), either the two Ts would be two unique TemplateTypeParmTypes by adding an instanceIndex or whatever to refer to the two different syntactic instances (this would of course naturally result in two unique STTPTs), or for a bit more savings the TTPTs for the two Ts could continue to be the same but their STTPTs would include the instanceIndex, so there is only one unique TTPT but two unique STTPTs referring to Ts.

I.e. if there are 100 types in in Ts there would still only be two unique SubstTemplateTypeParmTypes created, as opposed to only 1 as there currently is, or 100 as this patch creates.

Any way it is done, the goal would be to make the STTPTs just as unique as necessary so the Resugarer can keep map from the unique STTPTs to its current pack index, and iterate the relevant index with each call to TransformSubstTemplateTypeParmType.

If I'm missing something and it is substantially more complex than that, then you're right that storing the pack index might end up the best solution - we don't want to e.g. have to do considerable Sema logic in the Resugarer just to infer the pack index.

Or just SubstTemplateTypeParmType could store this number in addition to its TemplateTypeParmType? (E.g. the first Ts in an expansion is 0, the second Ts in the same expansion is 1, etc. - but it resets for the next expansion.)

Well that number is just the pack_index, as implemented in this current patch :)

Disregard my previous comment, you're right, that would be identical. I'm mixing up my STTPTs and TTPTs :). So the proposed solution would be to make the TTPTs unique, and map from the TTPTs to their current pack indices in the Resugarer. But, that is probably identical in terms of memory usage as your proposal to introduce a new sugar type representing unique expansion instances, and the latter is probably clearer/less disruptive.

Thanks for your work on this, and for explaining these complexities.

If I'm missing something and it is substantially more complex than that, then you're right that storing the pack index might end up the best solution - we don't want to e.g. have to do considerable Sema logic in the Resugarer just to infer the pack index.

Having a new Subst-like node that wraps the whole expanded pattern seems to simplify the case where we need to take the type apart, such as deductions, leaving you without the 'birds-eye' view. Otherwise we would need to dig into the type to figure out what pack indexes we need to preserve in the resulting type.

Thanks for your work on this, and for explaining these complexities.

TBH I still don't know what to do here from the project perspective. I have a new deadline now that runs until November.

It seems that sitting down, figuring out a solution and coming up with a review-ready patch would take a considerable portion of that time. And then the review could take a year.

Meanwhile there is still considerable work remaining on the resugarer, in terms of addressing performance and getting everything into a reviewable state.

So I might try leaving this problem aside and not handle packs, for now. This would make us not able to resugar tuples for example, which is very unfortunate.

But in the unlikely case there was someone else interested in solving this problem, this could be parallelized.

nridge added a subscriber: nridge.Aug 27 2022, 9:27 PM
mizvekov reopened this revision.Aug 28 2022, 4:54 PM
This revision is now accepted and ready to land.Aug 28 2022, 4:54 PM
mizvekov planned changes to this revision.Aug 28 2022, 4:57 PM
mizvekov updated this revision to Diff 456220.
mizvekov retitled this revision from Clang: fix AST representation of expanded template arguments. to WIP: Clang: fix AST representation of expanded template arguments..
mizvekov edited the summary of this revision. (Show Details)

@davrec @alexfh

I finally managed to have a talk to @rsmith about this.

He thinks that, even if we can't come up with a better solution in time, the benefits of this patch justify the costs observed, as those substitution nodes are pretty useless without a way to index the pack, and having a rich AST is one of Clang's main goals.

But he also came up with this nifty idea that this high cost is perhaps coming because we are reindexing the whole pack every time that repro recurses on that disjunction and we consume the first element in the deduction.
And that a good solution for this might be to just index the pack backwards, so that 0 is the last element.

I will try that sometime soon. Even if I don't see an improvement, any further objections we just re-merge this?

PS: That does not disregard the other ideas to further improve this in the future. Another thing he noted that would be important is to also be able to represent in the resulting AST a pack that expanded 0 times.

davrec added a comment.Sep 8 2022, 9:11 AM

@davrec @alexfh

I finally managed to have a talk to @rsmith about this.

He thinks that, even if we can't come up with a better solution in time, the benefits of this patch justify the costs observed, as those substitution nodes are pretty useless without a way to index the pack, and having a rich AST is one of Clang's main goals.

I support that - re-merge for now, and consider ways to reduce costs going forward.

I thought this through every which way and could not avoid @mizvekov's conclusion, that the only other way would be another Subst* type node, from which the pack indices could be inferred, but only via non-straightforward code requiring a bird's eye view of the AST.

Taking a step back and reconsidering, it's clear the original way - just storing the pack index - is very much preferable to all that complexity. The benefits justify the costs for most users, and for clang developers who would be burdened by the appearance of an obscure new type class. And as I said previously I think broad flags governing how much sugar/non-semantic info is added to the AST, i.e. letting users balance benefits vs. costs for themselves, is the better way to handle these concerns.

mizvekov requested review of this revision.Sep 14 2022, 1:45 PM
mizvekov updated this revision to Diff 460206.
mizvekov retitled this revision from WIP: Clang: fix AST representation of expanded template arguments. to Clang: fix AST representation of expanded template arguments..
mizvekov edited the summary of this revision. (Show Details)
This revision is now accepted and ready to land.Sep 14 2022, 1:45 PM
mizvekov updated this revision to Diff 460595.Sep 15 2022, 6:10 PM
mizvekov updated this revision to Diff 460907.Sep 16 2022, 2:34 PM
mizvekov edited the summary of this revision. (Show Details)

The reverse indexing worked great for that problem, the patch has no perf impact now on that reproduction.cpp.

That doesn't mean it doesn't have any impact in general, or that it can't be further improved, but other solutions will take more time to develop and we can revisit this later.