This is an archive of the discontinued LLVM Phabricator instance.

[clang] Fix 'operator=' lookup in nested class
Needs ReviewPublic

Authored by J-Camilleri on Apr 7 2023, 4:53 AM.

Details

Summary

If implicitly declared 'operator=' declarations are generated before
encountering a use in a nested class, custom definitions of the operator
in the nested class are not found.

The issue occurs because when searching for the operator in the nested
class the implicitly declared functions are found in the IdResolver and
returned. The IdResolver does not contain the custom defined operator
and so it is ignored.

Diff Detail

Event Timeline

J-Camilleri created this revision.Apr 7 2023, 4:53 AM
Herald added a project: Restricted Project. · View Herald TranscriptApr 7 2023, 4:53 AM
J-Camilleri requested review of this revision.Apr 7 2023, 4:53 AM
Herald added a project: Restricted Project. · View Herald TranscriptApr 7 2023, 4:53 AM
Herald added a subscriber: cfe-commits. · View Herald Transcript

Firstly, If there are other reviewers which I should include instead please let me know, the developers that changed around this area did so a long time ago and I do not know who to add.

I have described the problem in the github issue (https://github.com/llvm/llvm-project/issues/59684) and as I said there I am uncertain about my solution.
Given the issue and my limited knowledge of the code base, I am thinking about three possible solutions to this issue.

  1. Have the user defined operator present in the IdResolver so that it is returned with the implicitly declared declarations.
    • I have seen cases where the declaration is kept in an FETokenInfo section but not added to the IdResolver, it gets removed at some point and leads to it not being present when CppLookupName is called
  1. Have the IdResolver declarations ordered by scope, going from most nested to least nested.
    • This suggestion stems from how the ommition of Outer::invokeAssign is handled.
    • Seems expensive to maintain.

I do not understand the role of the IdResolver well so I do not know if the suggestions above misuse it/ change it's intended role.

  1. Do not end the search at the IdResolver if we are searching for an 'operator='.
    • I went for this solution because it is simplest to implement and has the least changes.
    • This causes the LookupResult to temporary contain duplicate declarations, they are cleaned later by a call to LookupResult::resolveKind but it still feels like this should not happen. I guess we can skip the loop alltogether to not have duplications but I did not check to see if this is viable.

I guess my main issue with this is understanding the role of IdResolver, should the order to implicit declarations change what it returns, should it already contain the user defined operator declaration?

shafik added a subscriber: shafik.Apr 7 2023, 2:35 PM
shafik added inline comments.
clang/lib/Sema/SemaLookup.cpp
1328

Maybe we should also add an ND.isImplicit() check as well? I am not sure if this is the correct solution either.

rsmith added a comment.Apr 7 2023, 3:58 PM

What *should* happen here is that the lookup for operator= in Inner::invokeAssign should walk up the enclosing Scopes and DeclContexts, traversing both the results from the IdResolver and the results from DeclContext lookups. First, we should look in the scope corresponding to invokeAssign, where we find no results in the IdResolver (and there's no corresponding DeclContext so there's nowhere else to look). Then we should look in the scope corresponding to Inner, where again there's nothing in the IdResolver at that scope, but lookup in the DeclContext should find the member operator=. And so it shouldn't matter that the Outer scope has a lookup result in the IdContext, because that scope is never the one that we're doing a lookup within.

It sounds like the implicit lazy declaration of the operator= might be adding the name to the wrong scope, or perhaps the set of nested Scopes at the point where we parse Inner::invokeAssign is misconfigured. In Sema::DeclareImplicitCopyAssignment, we call getScopeForContext(ClassDecl) to find the right scope, and inject the name into that scope. I suspect that we could simply delete that entirely, given that any lookup for a class member name will also look into the DeclContext and doesn't need to find results in the Scope (in the IdResolver), but it still seems valuable to figure out why this isn't working. getScopeForContext finds the *innermost* scope corresponding to the given class, so if we had multiple enclosing scopes for the same class, we might get into some trouble, and it seems possible that that happens when reentering scopes when doing delayed parsing of class members. The scopes for delayed parsing are created by Parser::ReenterClassScopeRAII, and it looks like it's doing the right thing, but that's where I'd start looking.

rsmith added a comment.Apr 7 2023, 4:04 PM

Ah, I see.

  1. Have the IdResolver declarations ordered by scope, going from most nested to least nested.

This is the intended behavior. But Sema::PushOnScopeChains doesn't correctly handle declarations being added in non-lexical order, except for label declarations, which are special-cased.

I think probably the cleanest fix would be to add a flag to PushOnScopeChains to indicate that the scope to which we're adding the name may not be the innermost scope, and so we should find the correct insertion point (like we already do for LabelDecls). We should then set that flag when calling PushOnScopeChains with a scope that's not necessarily the innermost scope, such as from Sema::DeclareImplicit*.