Page MenuHomePhabricator

Please use GitHub pull requests for new patches. Phabricator shutdown timeline

[clang] Track the templated entity in type substitution.
ClosedPublic

Authored by mizvekov on Aug 14 2022, 6:10 AM.

Details

Summary

This is a change to how we represent type subsitution in the AST.
Instead of only storing the replaced type, we track the templated
entity we are substituting, plus an index.
We modify MLTAL to track the templated entity at each level.

Otherwise, it's much more expensive to go from the template parameter back
to the templated entity, and not possible to do in some cases, as when
we instantiate outer templates, parameters might still reference the
original entity.

This also allows us to very cheaply lookup the templated entity we saw in
the naming context and find the corresponding argument it was replaced
from, such as for implementing template specialization resugaring.

Signed-off-by: Matheus Izvekov <mizvekov@gmail.com>

Diff Detail

Event Timeline

There are a very large number of changes, so older changes are hidden. Show Older Changes
davrec added inline comments.Sep 26 2022, 6:45 AM
clang/include/clang/AST/Type.h
5109

@sammccall , drafting you as a representative for downstream AST users, do you have any objection to changing this return type to a TemplateTypeParmDecl from a TemplateTypeParmType? IIUC the change is not strictly necessary but is more for convenience, correct me if I'm wrong @mizvekov. But I'm inclined to think STTPT is not much used downstream so the churn will be minimal/acceptable, am I wrong?

So this patch at least puts that into consistency.

Is there already an analogous breakage for classes then?

I am not sure I follow the question, but I meant that classes were already deserealizing templates first, so no need to fix them.

Given @mizvekov's approaching deadline and @ChuanqiXu's limited availability, I'm inclined to think if @mizvekov can hack together a reasonable fix for this breakage, leaving a more rigorous/general solution for later as @ChuanqiXu suggests he would like to pursue, that would be sufficient to land this. Does anyone disagree?

Well I wouldn't say this fix is a hack any more than the whole code is, as it's already playing by the rules of the existing implementation.

We already have to deal with objects being accessed before they are fully deserialized, this is part of the design of the current solution.
The present patch just adds a, presumably new, case where the order is important.

Any changes in order to have a more strict, atomic deserialization are out of scope here.

By the way, I will let this patch sit for another week, as it will need some significant rebasing against a recent change that was merged, which could also possibly be reverted.

clang/include/clang/AST/Type.h
5109

I don't think this is strictly true.

We don't need to store the type, as that would increase storage for no benefit. The type won't have more usable information than the associated parameter declaration. The type can support a canonical representation, which we don't need.

We will store the template-like declaration + index, and simply access the parameter decl from there.
Otherwise creating a type from that requires the ASTContext and would possibly involve allocation.

clang/include/clang/Sema/Template.h
80

I would be against introducing another term to refer to the same thing...

So this patch at least puts that into consistency.

Is there already an analogous breakage for classes then?

I am not sure I follow the question, but I meant that classes were already deserealizing templates first, so no need to fix them.

Question was referring to @ChuanqiXu 's comment about this patch breaking modules for function importing, see his repro (which I haven't looked at closely). I was asking if this meant that classes were already broken in a similar way, since you said this patch brings them into consistency.

Given @mizvekov's approaching deadline and @ChuanqiXu's limited availability, I'm inclined to think if @mizvekov can hack together a reasonable fix for this breakage, leaving a more rigorous/general solution for later as @ChuanqiXu suggests he would like to pursue, that would be sufficient to land this. Does anyone disagree?

Well I wouldn't say this fix is a hack any more than the whole code is, as it's already playing by the rules of the existing implementation.

We already have to deal with objects being accessed before they are fully deserialized, this is part of the design of the current solution.
The present patch just adds a, presumably new, case where the order is important.

Any changes in order to have a more strict, atomic deserialization are out of scope here.

I think we misunderstand each other again, but are you saying the breakage found by @ChuanqiXu does not need to be fixed because it is out of scope? (If classes are already broken due to this issue, the argument might have some validity.)

I wouldn't feel comfortable accepting this without either a) a hack to fix the breakage or b) @ChuanqiXu saying it is okay. Maybe best to punt until @ChuanqiXu gets back from vacation in a couple weeks IIUC?

clang/include/clang/AST/Type.h
5109

I see, so it is a necessary change, and anyone who really depends on this returning the TTPT will need to convert it via Context.getTemplateTypeParmType(...). But any code written getReplacementParameter()->getDecl() would just lose the getDecl() and be fine. And since all the info is in decl most people will only need to do the latter. Given the necessity I think that's okay.

clang/include/clang/Sema/Template.h
80

The reason we need this unfortunately vague term "AssociatedDecl" in STTPT is because it can either be a template/template-like declaration *or* a TemplateTypeParmDecl. But here in MLTAL, it won't be a TTPD, will it? It will always be the parent template/template-like declaration, right? So there is no need for vagueness. ReplacedDecl or ParentTemplate or something like that seems more appropriate.

Question was referring to @ChuanqiXu 's comment about this patch breaking modules for function importing, see his repro (which I haven't looked at closely). I was asking if this meant that classes were already broken in a similar way, since you said this patch brings them into consistency.

No, I haven't had time to investigate that test case, as right now I am advancing in other parts of the stack.

But what I meant was that importing classes was already a case where the template parts could be accessed before the whole class was imported, and the existing code was already importing the template bits first to cope with that. The changes in this patch align the other cases to align with the need to import template parts first.

I think we misunderstand each other again, but are you saying the breakage found by @ChuanqiXu does not need to be fixed because it is out of scope? (If classes are already broken due to this issue, the argument might have some validity.)

No, fixing that breakage is in scope.

What I mean is out of scope is a major refactoring that would change how objects are deserialized, so that they are not accessed while they are part way there.
This, or a subset thereof, is what @ChuanqiXu was referring to, which would take about two months to implement.

mizvekov updated this revision to Diff 464937.Oct 4 2022, 3:26 AM

Sorry for the long delay, the size of this makes it tough to review, and I've been trying to make sure my concepts patch didn't get reverted. This generally looks good to me, however, I really don't like the number of bits to represent 'template parameter' being inconsistent between bitfields in general. I'm reasonably OK limiting it, but having 'surprise' limits only when we deal with replacements isn't really acceptable. When we hit these limits, we should fail early.

clang/include/clang/AST/Type.h
1836

Is it a problem for this to be a different number of bits used to represent the number of template parameters? I would expect we would want to make them all them the same size (else we have an additional limit on the size of parameters).

1858

Did the testing here result in finding anyone who used this? I'm sure libcxx or some of the boost libraries do a lot of MP on large sizes (though I note libcxx seems to have passed pre-commit).

clang/include/clang/Sema/Template.h
104–105

2 parameters with no default means this doesn't have to be explicit anymore.

mizvekov updated this revision to Diff 466320.Oct 8 2022, 5:28 PM
mizvekov marked 12 inline comments as done.
Herald added a project: Restricted Project. · View Herald TranscriptOct 8 2022, 5:28 PM
mizvekov added inline comments.Oct 8 2022, 5:29 PM
clang/include/clang/AST/Type.h
1836

Per documentation reasons, it would certainly be simpler if they were the same size.

But they are not the same thing. Index is for indexing into arguments bound to non-pack parameters,
while PackIndex is for indexing arguments within a pack.

The only way Index could reach such huge values, would be if there were a corresponding huge amount of template parameters. For a pack that is not the case.

So packs are much more efficient for handling large amounts of arguments, so it makes sense to have a higher limit for them.

But how much that should be in practice is hard to tell.

I just can't see that 2^15 would not be enough for Index.

For PackIndex, 2^16 seems reasonable to me, but in the worst case someone complains, we can just store a larger PackIndex in a tail allocated field, while we keep storing small ones in the TypeBitfields.

1858

Not on anything LLVM pre-commit as you saw.

I am not sure how much we should be concerned about here.
This is still much larger than implimits. If this hits anyone, we will just need to add a test case for a reasonable limit and extend this node to hold large values in tail-allocated extra fields.

clang/include/clang/Sema/Template.h
80

No, it can be the TTPD which is used to represent the invented template for a requires substitution.

clang/lib/Serialization/ASTWriterDecl.cpp
622–625

This one was actually because merging of FunctionTemplateDecls was a bit more complicated, I made changes so that we always merge them from the FunctionDecl side, and it's working now.

erichkeane accepted this revision.Oct 10 2022, 7:19 AM

This LGTM, but please give @davrec time (a few days?) to do another review before merging.

ChuanqiXu added inline comments.Oct 10 2022, 8:38 PM
clang/lib/Serialization/ASTWriterDecl.cpp
622–625

I see. Thanks for the working! My work to defer reading types goes slowly. So it may not make sense to block this. But please give me a week to review the related changes.

davrec accepted this revision.Oct 11 2022, 3:20 PM

Looks good, over to @ChuanqiXu

clang/include/clang/Sema/Template.h
80

K, makes sense to keep it the same then

ChuanqiXu accepted this revision.Oct 13 2022, 8:44 PM

The codes look good and also the tests pass. So LGTM. Thanks!

clang/lib/Serialization/ASTReaderDecl.cpp
1082–1094

nit: I feel it lack a comment.

mizvekov updated this revision to Diff 467918.Oct 14 2022, 2:31 PM
mizvekov marked 4 inline comments as done.

I pushed some late changes:

  • Implement this new scheme uniformly across all template parameter kinds. Before we were doing only type parameters, now we are doing non-type and template template parameters. While the diff is not very small, it's more of the same stuff.
  • Fixed passing the wrong decl for variable template partial specialization.
  • Relaxed assertion on Subst* construction. We previously checked we could access the parameter, but lldb testing showed some cases in the ASTImporter where the parameter might not have been imported yet. This is otherwise fine as it will be available before first use.

I like the late changes, just need to add comments to the public methods, and maybe move getReplacedTemplateParameterList over to Decl.

clang/include/clang/AST/DeclTemplate.h
3427

I don't object with making this public, and I can see the argument for making this its own function rather than a Decl method all things being equal, but given that we already have Decl::getDescribedTemplateParams, I think that

  1. this should probably also go in Decl,
  2. a good comment is needed explaining the difference between the two (e.g. that the D is the template/template-like decl itself, not a templated decl as required by getDescribedTemplateParams, or whatever), and what happens when it is called on a non-template decl, and
  3. it probably should be named just getTemplateParams or something that suggests its difference with getDescribedTemplateParams.
clang/include/clang/AST/ExprCXX.h
4309–4320

Add comment

4325

(Not your responsibility but while you're adding comments maybe you could add one for this too.)

4382–4384

Add comments

clang/include/clang/AST/TemplateName.h
153–154

Add comments

391–406

Add comments

clang/lib/AST/DeclTemplate.cpp
1610

Now that this is public, probably should return nullptr - otherwise users need the same huge switch just to decide whether they can call this function.

davrec added inline comments.Oct 15 2022, 5:52 AM
clang/include/clang/AST/DeclTemplate.h
3427

On second thought this should definitely be a Decl method called getTemplateParameters(), since all it does is dispatches to the derived class's implementation of that.

mizvekov added inline comments.Oct 15 2022, 8:20 AM
clang/include/clang/AST/DeclTemplate.h
3427

I think this function shouldn't be public in that sense, it should only be used for implementation of retrieving parameter for Subst* nodes.

I don't think this should try to handle any other kinds of decls which can't appear as the AssociatedDecl in a Subst node.

There will be Subst specific changes here in the future, but which depend on some other enablers:

  • We need to make Subst nodes for variable templates store the SpecializationDecl instead of the TemplateDecl, but this requires changing the order between creating the specialization and performing the substitution. I will do that in a separate patch.
  • We will stop creating Subst* nodes for things we already performed substitution with sugar. And so, we won't need to handle alias templates, conceptdecls, etc. Subst nodes will only be used for things which we track specializations for, and that need resugaring.

It's only made 'public' because now we are using it across far away places like Type.cpp and ExprCXX.cpp and TemplateName.cpp.

I didn't think this needs to go as a Decl member, because of limited scope and because it only ever needs to access public members.
I don't think this justifies either a separate header to be included only where it's needed.

One option is to further make the name more specific, by adding Internal to the name for example.
Any other ideas?

mizvekov updated this revision to Diff 468026.Oct 15 2022, 9:54 AM
mizvekov marked 3 inline comments as done.
mizvekov marked an inline comment as done.Oct 15 2022, 9:55 AM
mizvekov added inline comments.
clang/include/clang/AST/DeclTemplate.h
3427

I ended up simply documenting that this is an internal function, for now.

davrec added inline comments.Oct 15 2022, 11:36 AM
clang/include/clang/AST/DeclTemplate.h
3427

That works for me.

This revision was not accepted when it landed; it landed in state Needs Review.Oct 15 2022, 1:09 PM
This revision was automatically updated to reflect the committed changes.
mizvekov marked 4 inline comments as done.Oct 15 2022, 1:09 PM

Heads-up - I'm seeing the compilation failure that reduces to this commit. I didn't get a reproducer of the reasonable size yet :(

The problem occurs when using modules only.

The module has code like:

template <typename T>
class A {
 public:
  T x;
  template <typename H>
  friend H Helper(H h, const A& a) {
    return H::foo(std::move(h), a.x);
  }
};

Then, when the module client uses Helper, we get an error:

header.h:46:12: error: inline function 'Helper<B>' is not defined [-Werror,-Wundefined-inline]

If I move Helper definition out of friend declaration:

template <typename T>
class A {
 public:
  T x;
  template <typename H>
  friend H Helper(H h, const A& a);
};

template <typename H, typename T>
H Helper(H h, const A<T>& a) {
  return H::foo(std::move(h), a.x);
}

things compile ok.

Without modules, both versions compile ok.

Hi @mizvekov.

This change has caused a failure in Clang's stage 2 CI on the green dragon Darwin CI: https://green.lab.llvm.org/green/job/clang-stage2-Rthinlto/6390/console.

Assertion failed: (lvaluePath->getType() == elemTy && "Unexpected type reference!"), function readAPValue, file /Users/buildslave/jenkins/workspace/clang-stage1-RA/clang-build/tools/clang/include/clang/AST/AbstractBasicReader.inc, line 736.
Assertion failed: (BlockScope.empty() && CurAbbrevs.empty() && "Block imbalance"), function ~BitstreamWriter, file /Users/buildslave/jenkins/workspace/clang-stage1-RA/llvm-project/llvm/include/llvm/Bitstream/BitstreamWriter.h, line 119.

Full log is here - https://green.lab.llvm.org/green/job/clang-stage2-Rthinlto/6390/console.

This happens when a Release+Asserts stage 1 clang is building stage 2 clang when Clang modules are enabled. You should be able to reproduce this by building Release+Asserts clang, and then using it to build clang again with the LLVM_ENABLE_MODULES cmake option.

Could you please look into this as soon as you can?

Ping @mizvekov.

Unfortunately I'm unable to revert this commit now so we won't be able to get the bot back to green until it's fixed.

Ping @mizvekov.

Unfortunately I'm unable to revert this commit now so we won't be able to get the bot back to green until it's fixed.

Hello, sorry for the long time to respond.

Can you tell if this has been fixed by https://github.com/llvm/llvm-project/commit/303f20a2cab4554a10ec225fd853709146f8c51c ?

> Can you tell if this has been fixed by https://github.com/llvm/llvm-project/commit/303f20a2cab4554a10ec225fd853709146f8c51c ?

I can still reproduce this with latest commit. It is not fixed. At least it can easily reproduce on Darwin platform with bootstrap (stage 2 with module enabled).

Looking at the debugger output:

(lldb) p lvaluePath->getType().dump()
FunctionProtoType 0x1424badb0 'void (const unsigned long long &, llvm::raw_ostream &, StringRef)' imported cdecl
|-BuiltinType 0x12886e870 'void'
|-LValueReferenceType 0x1424bad80 'const unsigned long long &' imported
| `-QualType 0x1424bad21 'const unsigned long long' const
|   `-SubstTemplateTypeParmType 0x1424bad20 'unsigned long long' sugar imported typename depth 0 index 0 T
|     |-ClassTemplateSpecialization 0x1424bab70 'format_provider'
|     `-BuiltinType 0x12886e9f0 'unsigned long long'
|-LValueReferenceType 0x13926fd00 'llvm::raw_ostream &' imported
| `-ElaboratedType 0x13926fcd0 'llvm::raw_ostream' sugar imported
|   `-RecordType 0x128c396a0 'class llvm::raw_ostream' imported
|     `-CXXRecord 0x128c39758 'raw_ostream'
`-ElaboratedType 0x128bf6c20 'StringRef' sugar imported
  `-RecordType 0x128bf6c00 'class llvm::StringRef' imported
    `-CXXRecord 0x128c02570 'StringRef'

(lldb) p elemTy.dump()
FunctionProtoType 0x1392702d0 'void (const unsigned long long &, llvm::raw_ostream &, StringRef)' cdecl
|-BuiltinType 0x12886e870 'void'
|-LValueReferenceType 0x139270140 'const unsigned long long &'
| `-QualType 0x139270111 'const unsigned long long' const
|   `-SubstTemplateTypeParmType 0x139270110 'unsigned long long' sugar typename depth 0 index 0 T
|     |-ClassTemplateSpecialization 0x13924aed8 'format_provider'
|     `-BuiltinType 0x12886e9f0 'unsigned long long'
|-LValueReferenceType 0x13926fd00 'llvm::raw_ostream &' imported
| `-ElaboratedType 0x13926fcd0 'llvm::raw_ostream' sugar imported
|   `-RecordType 0x128c396a0 'class llvm::raw_ostream' imported
|     `-CXXRecord 0x128c39758 'raw_ostream'
`-ElaboratedType 0x128bf6c20 'StringRef' sugar imported
  `-RecordType 0x128bf6c00 'class llvm::StringRef' imported
    `-CXXRecord 0x128c02570 'StringRef'
steven_wu added a comment.EditedDec 14 2022, 11:19 AM

The workaround in https://reviews.llvm.org/D139956 can get llvm-project bootstrap correctly, doesn't fix the underlying issue.

The new failures is in building SupportTests where it issues a weird warning followed by missing symbol. It might be related to this commit too:
https://green.lab.llvm.org/green/job/clang-stage2-Rthinlto/6488/

In module 'LLVM_Utils' imported from /Users/buildslave/jenkins/workspace/clang-stage2-Rthinlto/llvm-project/llvm/unittests/Support/LinearPolyBaseTest.cpp:9:
/Users/buildslave/jenkins/workspace/clang-stage2-Rthinlto/llvm-project/llvm/include/llvm/Support/TypeSize.h:215:3: warning: inline function 'llvm::operator-<long long>' is not defined [-Wundefined-inline]
  operator-(const LeafTy &LHS) {
  ^
/Users/buildslave/jenkins/workspace/clang-stage2-Rthinlto/llvm-project/llvm/unittests/Support/LinearPolyBaseTest.cpp:173:13: note: used here
  EXPECT_EQ(-Univariate3D(4, 0), Univariate3D(-4, 0));
            ^
1 warning generated.
Undefined symbols for architecture x86_64:
  "std::__1::enable_if<std::is_signed<long long>::value, Univariate3D>::type llvm::operator-<long long>(Univariate3D const&)", referenced from:
      UnivariateLinearPolyBase_Univariate3D_Invert_Test::TestBody() in lto.o
ld: symbol(s) not found for architecture x86_64

Note that the Clang 16 branch is coming up approximately on January 24, and this needs to be reverted or have the issue reported by @steven_wu fixed by then. @mizvekov : If you or someone else don't have a solution/revert to this by January 13th, @aaron.ballman and I will begin the process to revert this ourselves.

@mizvekov : I'm looking at the revert of this since no work seems to have been done to fix @steven_wu 's problem. Unfortunately, it seems that reverting this requires reverting https://reviews.llvm.org/D134604 plus all of the "Instantiate" patches, which undoes basically all of your work.

I'm starting it now, but its unfortunate that we are going to lose this much of your work.

@mizvekov : I'm looking at the revert of this since no work seems to have been done to fix @steven_wu 's problem. Unfortunately, it seems that reverting this requires reverting https://reviews.llvm.org/D134604 plus all of the "Instantiate" patches, which undoes basically all of your work.

I'm starting it now, but its unfortunate that we are going to lose this much of your work.

That is *really* unfortunate. @mizvekov (@erichkeane) is there any other way out here?

@mizvekov : I'm looking at the revert of this since no work seems to have been done to fix @steven_wu 's problem. Unfortunately, it seems that reverting this requires reverting https://reviews.llvm.org/D134604 plus all of the "Instantiate" patches, which undoes basically all of your work.

I'm starting it now, but its unfortunate that we are going to lose this much of your work.

That is *really* unfortunate. @mizvekov (@erichkeane) is there any other way out here?

@mizvekov said he was working on fixing the issue, and we gave him a deadline of Tomorrow for it. If he (or someone else) manages to fix the problem by Tuesday early AM PT, I won't have to do that. BUT we cannot release with this regression.

This change has caused a failure in Clang's stage 2 CI on the green dragon Darwin CI: https://green.lab.llvm.org/green/job/clang-stage2-Rthinlto/6390/console.

Assertion failed: (lvaluePath->getType() == elemTy && "Unexpected type reference!"), function readAPValue, file /Users/buildslave/jenkins/workspace/clang-stage1-RA/clang-build/tools/clang/include/clang/AST/AbstractBasicReader.inc, line 736.

This assert is simply wrong, and I've removed it in rG2009f2450532450a99c1a03d5e2c30f478121839 -- that change should be safe to cherry-pick into the release branch. It's possible for the recomputation of the type after deserialization to result in a different type than what we saw when serializing, because redeclarations of the same entity can use the same type with different sugar -- or even slightly different types in some cases, such as when an array bound is added in a redeclaration. The dumps of the types provided by @steven_wu confirms that we were just seeing a difference in type sugar in this case.

Assertion failed: (BlockScope.empty() && CurAbbrevs.empty() && "Block imbalance"), function ~BitstreamWriter, file /Users/buildslave/jenkins/workspace/clang-stage1-RA/llvm-project/llvm/include/llvm/Bitstream/BitstreamWriter.h, line 119.

Is this still happening? If so, this looks more serious, and will need further investigation.

Can we undo the workaround in https://reviews.llvm.org/D139956 and see if the bot is now happy? Or can someone who was seeing problems before (@steven_wu?) run a test?

This change has caused a failure in Clang's stage 2 CI on the green dragon Darwin CI: https://green.lab.llvm.org/green/job/clang-stage2-Rthinlto/6390/console.

Assertion failed: (lvaluePath->getType() == elemTy && "Unexpected type reference!"), function readAPValue, file /Users/buildslave/jenkins/workspace/clang-stage1-RA/clang-build/tools/clang/include/clang/AST/AbstractBasicReader.inc, line 736.

This assert is simply wrong, and I've removed it in rG2009f2450532450a99c1a03d5e2c30f478121839 -- that change should be safe to cherry-pick into the release branch. It's possible for the recomputation of the type after deserialization to result in a different type than what we saw when serializing, because redeclarations of the same entity can use the same type with different sugar -- or even slightly different types in some cases, such as when an array bound is added in a redeclaration. The dumps of the types provided by @steven_wu confirms that we were just seeing a difference in type sugar in this case.

Assertion failed: (BlockScope.empty() && CurAbbrevs.empty() && "Block imbalance"), function ~BitstreamWriter, file /Users/buildslave/jenkins/workspace/clang-stage1-RA/llvm-project/llvm/include/llvm/Bitstream/BitstreamWriter.h, line 119.

Is this still happening? If so, this looks more serious, and will need further investigation.

Can we undo the workaround in https://reviews.llvm.org/D139956 and see if the bot is now happy? Or can someone who was seeing problems before (@steven_wu?) run a test?

Thank you for poking at this Richard! However, I think we still need to revert the functionality in this area unless we're able to make headway on https://github.com/llvm/llvm-project/issues/59271 and quickly. FWIW, I ran into this exact problem yesterday on my dev machine, so the overhead is still a present concern. If that's something you plan to work on, then I think it'd make sense for Erich to hold off on starting the revert work to give you a chance to improve this. But if nobody is actively working on it, we need to start pulling this back because the branch date is a bit over a week away (Jan 24).

This change has caused a failure in Clang's stage 2 CI on the green dragon Darwin CI: https://green.lab.llvm.org/green/job/clang-stage2-Rthinlto/6390/console.

Assertion failed: (lvaluePath->getType() == elemTy && "Unexpected type reference!"), function readAPValue, file /Users/buildslave/jenkins/workspace/clang-stage1-RA/clang-build/tools/clang/include/clang/AST/AbstractBasicReader.inc, line 736.

This assert is simply wrong, and I've removed it in rG2009f2450532450a99c1a03d5e2c30f478121839 -- that change should be safe to cherry-pick into the release branch. It's possible for the recomputation of the type after deserialization to result in a different type than what we saw when serializing, because redeclarations of the same entity can use the same type with different sugar -- or even slightly different types in some cases, such as when an array bound is added in a redeclaration. The dumps of the types provided by @steven_wu confirms that we were just seeing a difference in type sugar in this case.

Assertion failed: (BlockScope.empty() && CurAbbrevs.empty() && "Block imbalance"), function ~BitstreamWriter, file /Users/buildslave/jenkins/workspace/clang-stage1-RA/llvm-project/llvm/include/llvm/Bitstream/BitstreamWriter.h, line 119.

Is this still happening? If so, this looks more serious, and will need further investigation.

Can we undo the workaround in https://reviews.llvm.org/D139956 and see if the bot is now happy? Or can someone who was seeing problems before (@steven_wu?) run a test?

Thank you for poking at this Richard! However, I think we still need to revert the functionality in this area unless we're able to make headway on https://github.com/llvm/llvm-project/issues/59271 and quickly. FWIW, I ran into this exact problem yesterday on my dev machine, so the overhead is still a present concern. If that's something you plan to work on, then I think it'd make sense for Erich to hold off on starting the revert work to give you a chance to improve this. But if nobody is actively working on it, we need to start pulling this back because the branch date is a bit over a week away (Jan 24).

My understanding is that the submitter of that bug did sufficient analysis to determine that https://reviews.llvm.org/D136566 is the cause of his perf regression, having done an analysis the patch before and after. The only reason to revert THIS patch (and the follow-ups, since this is a 'base patch' to the rest) is the report by @steven_wu .

SO, @steven_wu: Can ypu please, ASAP, try to reproduce your issue as Richard asked above? IF so, we only have to revert D136566, which should fix our performance issue. (that is, revert the workaround you submitted in https://reviews.llvm.org/D139956, then see if it works?).

@arphaman I see you're part of the original report too, if you could try to revert the workaround and test it, it would be nice too!

This change has caused a failure in Clang's stage 2 CI on the green dragon Darwin CI: https://green.lab.llvm.org/green/job/clang-stage2-Rthinlto/6390/console.

Assertion failed: (lvaluePath->getType() == elemTy && "Unexpected type reference!"), function readAPValue, file /Users/buildslave/jenkins/workspace/clang-stage1-RA/clang-build/tools/clang/include/clang/AST/AbstractBasicReader.inc, line 736.

This assert is simply wrong, and I've removed it in rG2009f2450532450a99c1a03d5e2c30f478121839 -- that change should be safe to cherry-pick into the release branch. It's possible for the recomputation of the type after deserialization to result in a different type than what we saw when serializing, because redeclarations of the same entity can use the same type with different sugar -- or even slightly different types in some cases, such as when an array bound is added in a redeclaration. The dumps of the types provided by @steven_wu confirms that we were just seeing a difference in type sugar in this case.

Assertion failed: (BlockScope.empty() && CurAbbrevs.empty() && "Block imbalance"), function ~BitstreamWriter, file /Users/buildslave/jenkins/workspace/clang-stage1-RA/llvm-project/llvm/include/llvm/Bitstream/BitstreamWriter.h, line 119.

Is this still happening? If so, this looks more serious, and will need further investigation.

Can we undo the workaround in https://reviews.llvm.org/D139956 and see if the bot is now happy? Or can someone who was seeing problems before (@steven_wu?) run a test?

Thank you for poking at this Richard! However, I think we still need to revert the functionality in this area unless we're able to make headway on https://github.com/llvm/llvm-project/issues/59271 and quickly. FWIW, I ran into this exact problem yesterday on my dev machine, so the overhead is still a present concern. If that's something you plan to work on, then I think it'd make sense for Erich to hold off on starting the revert work to give you a chance to improve this. But if nobody is actively working on it, we need to start pulling this back because the branch date is a bit over a week away (Jan 24).

My understanding is that the submitter of that bug did sufficient analysis to determine that https://reviews.llvm.org/D136566 is the cause of his perf regression, having done an analysis the patch before and after. The only reason to revert THIS patch (and the follow-ups, since this is a 'base patch' to the rest) is the report by @steven_wu .

Ahhhh thank you for the extra information. I had missed that this was one before the perf issue.

SO, @steven_wu: Can ypu please, ASAP, try to reproduce your issue as Richard asked above? IF so, we only have to revert D136566, which should fix our performance issue. (that is, revert the workaround you submitted in https://reviews.llvm.org/D139956, then see if it works?).

+1, but based on the link to the workaround and what Richard fixed, I'm optimistic we can keep this patch.

Note, I just did the revert of the workaround myself here: 498bf3424d011235d420e02e80e135fae646d537

I won't see the failure on https://green.lab.llvm.org/green/job/clang-stage2-Rthinlto/ until Tuesday, but if it DOES fail like reported above, I'll continue on the effort to revert this patch.

Thanks a lot @rsmith for providing a fix and thanks a lot @aaron.ballman and @erichkeane for the efforts saving @mizvekov work over the summer. I believe he has sporadic access to internet and soon he will be back to normal. Great example of team work here!!

Thanks a lot @rsmith for providing a fix and thanks a lot @aaron.ballman and @erichkeane for the efforts saving @mizvekov work over the summer. I believe he has sporadic access to internet and soon he will be back to normal. Great example of team work here!!

Note we don't yet have the evidence that this is savable yet, we should know in the next day or so. At minimum, at least 1 of his patches needs to be reverted due to the memory regression.

Thanks a lot @rsmith for providing a fix and thanks a lot @aaron.ballman and @erichkeane for the efforts saving @mizvekov work over the summer. I believe he has sporadic access to internet and soon he will be back to normal. Great example of team work here!!

Note we don't yet have the evidence that this is savable yet, we should know in the next day or so. At minimum, at least 1 of his patches needs to be reverted due to the memory regression.

I see that https://green.lab.llvm.org/green/job/clang-stage2-Rthinlto/6584/ picked up your revert and seems successful. Then the next build is green as well and then starts failing for a reason that’s unrelated to this patch.

Thanks a lot @rsmith for providing a fix and thanks a lot @aaron.ballman and @erichkeane for the efforts saving @mizvekov work over the summer. I believe he has sporadic access to internet and soon he will be back to normal. Great example of team work here!!

Note we don't yet have the evidence that this is savable yet, we should know in the next day or so. At minimum, at least 1 of his patches needs to be reverted due to the memory regression.

I see that https://green.lab.llvm.org/green/job/clang-stage2-Rthinlto/6584/ picked up your revert and seems successful. Then the next build is green as well and then starts failing for a reason that’s unrelated to this patch.

Yep! So only 1 of the patches (D136566 needs reverting/further attention).

Thanks a lot @rsmith for providing a fix and thanks a lot @aaron.ballman and @erichkeane for the efforts saving @mizvekov work over the summer. I believe he has sporadic access to internet and soon he will be back to normal. Great example of team work here!!

Note we don't yet have the evidence that this is savable yet, we should know in the next day or so. At minimum, at least 1 of his patches needs to be reverted due to the memory regression.

I see that https://green.lab.llvm.org/green/job/clang-stage2-Rthinlto/6584/ picked up your revert and seems successful. Then the next build is green as well and then starts failing for a reason that’s unrelated to this patch.

Yep! So only 1 of the patches (D136566 needs reverting/further attention).

Awesome!!