This is an archive of the discontinued LLVM Phabricator instance.

DR39: Perform ambiguous subobject checks for class member access as part of object argument conversion, not as part of name lookup.
Needs ReviewPublic

Authored by rsmith on Jan 19 2021, 10:52 AM.

Details

Reviewers
rjmccall
Summary

Under C++ core issue 39, the rules for class-scope name lookup were reworked so
that name lookup no longer concerns itself with whether the names were found in
an ambiguous subobject. The checks for an ambiguous subobject are now done as
part of conversion of the object argument in a member access (if needed)
instead.

One other important consequence of the new lookup rule is that it's now OK to
find different lookup results in multiple different classes, so long as the
same set of entities is found in each case (in any order, perhaps with
duplicates, and in the type case, perhaps via unrelated typedef declarations).

This patch implements the new lookup rule. This also has some follow-on impact
on access checks: it's now OK for name lookup to find different member
declarations in different base classes so long as they all resolve to the same
set of entities, so we now need to compute what the corresponding found
declaration is, and access-check it, for each path independently.

Diff Detail

Event Timeline

rsmith requested review of this revision.Jan 19 2021, 10:52 AM
rsmith created this revision.
Herald added a project: Restricted Project. · View Herald TranscriptJan 19 2021, 10:52 AM

How does this new rule work if we find overlapping but non-equal sets of declarations in different subobjects? I'm sure you can get that with using declarations.

clang/include/clang/AST/CXXInheritance.h
77

Is there a way to know which case we're in, or do different consumers do different things?

(I didn't mean to submit the last comment before I finished the review, sorry)

It looks like you've chosen to treat this as a DR that we should apply under all standards. Have you investigated whether it causes compatibility problems? It does look like it'd be painful to try to support both rules.

clang/lib/Sema/SemaAccess.cpp
891

Is it okay to check them in sequence like this, or do we need to check in advance which case to use?

983

Is this not doing a *lot* of extra work? I suppose this is only in the slow path where we weren't able to immediately recognize that the found decl is public or we're in a scope with obviously adequate access?

clang/lib/Sema/SemaLookup.cpp
2296

I wonder under this new model if we can record that we saw something along ambiguous paths and avoid a lot of work in access-checking in the common case where we don't?

How does this new rule work if we find overlapping but non-equal sets of declarations in different subobjects? I'm sure you can get that with using declarations.

You can. The rule is that you resolve using-declarations to the declarations they name, and you resolve typedef declarations to the types they name, and the resulting set of declarations and types must be the same in every subobject in which there are (non-shadowed) results -- if not, the lookup is ill-formed. (Put another way, the set of entities found in each subobject must be the same, but the set of declarations found in each subobject can differ.)

clang/include/clang/AST/CXXInheritance.h
77

We could use a discriminated union here. Ultimately, I think what we really want is to provide some opaque storage for the consumer to use, rather than to have some hard-coded list here. I haven't thought of a good way to expose that; I don't think it's worth templating the entire base path finding mechanism to add such storage, and a single pointer isn't large enough for a DeclContext::lookup_result.

Vassil's https://reviews.llvm.org/D91524 will make the lookup_result case fit into a single pointer; once we adopt that, perhaps we can switch to holding an opaque void* here?

clang/lib/Sema/SemaAccess.cpp
891

For types, declaresSameEntity returns true for a subset of the cases for which declaresSameType returns true, so it will always be correct to check both (though for a type, checking declaresSameEntity will always be redundant).

983

Compared to the isDerivedFrom check, this will be doing one extra decl context hash table lookup per class we consider on each path. I haven't measured the extra work here; I can if you're concerned.

I thought for a while about better ways to handle this, and I'm not sure there's a good middle ground between redoing the work like this and saving the base paths from name lookup and using them here. If you have better ideas, I'd appreciate them!

I suppose we probably have a spare bit on DeclAccessPair that we could use to track whether we're in the "hard" (found in classes of different types) case, and we could use isDerivedFrom in the "easy" cases. Do you think that's worthwhile?

Ah, I see you suggest doing something like that below. Yes, I can give that a go.

You can. The rule is that you resolve using-declarations to the declarations they name, and you resolve typedef declarations to the types they name, and the resulting set of declarations and types must be the same in every subobject in which there are (non-shadowed) results -- if not, the lookup is ill-formed. (Put another way, the set of entities found in each subobject must be the same, but the set of declarations found in each subobject can differ.)

Okay, makes sense. So the DeclAccessPairs in the lookup result are basically just the first+best pair we found for any particular entity?

clang/include/clang/AST/CXXInheritance.h
77

Yeah, that might be cleaner, assuming we really need this. What are the clients that need to store something specifically in the CXXBasePath object and can't just keep it separate?

clang/lib/Sema/SemaAccess.cpp
983

Thanks. Yeah, I think it would be nice if the very common case of finding something in a unique class didn't have to redo any lookups.

You can. The rule is that you resolve using-declarations to the declarations they name, and you resolve typedef declarations to the types they name, and the resulting set of declarations and types must be the same in every subobject in which there are (non-shadowed) results -- if not, the lookup is ill-formed. (Put another way, the set of entities found in each subobject must be the same, but the set of declarations found in each subobject can differ.)

Okay, makes sense. So the DeclAccessPairs in the lookup result are basically just the first+best pair we found for any particular entity?

Yes, each pair comprises the declaration from the first subobject, with the best unprivileged access for the corresponding declarations in all subobjects.

clang/include/clang/AST/CXXInheritance.h
77

The main users of this are in name lookup proper (where we store the lookup results on the path to avoid redoing the class scope hash table lookup) and now in access checking (where we store the most-accessible declaration). I think there's a couple of other name-lookup-like clients that also store a lookup result here.

In all cases the side storage is just a convenience. But it's probably an important convenience; it's awkward for a client to maintain a mapping on the side, because the key for that map would end up being the entire base path.

We could just number the paths as we find them, and let the consumers build a SmallVector on the side rather than storing data in the CXXBasePaths. We'd need to be careful about the post-processing pass in lookupInBases that removes virtual base paths that are shadowed by non-virtual inheritance from the vbase, but that seems feasible. Worthwhile?

rjmccall added inline comments.Jan 26 2021, 10:43 AM
clang/include/clang/AST/CXXInheritance.h
77

I see what you're saying. We have places that compute all the base paths in an array, and it's convenient to be able to stash things into the elements of that array instead of having a second data structure that's parallel to it. I guess there are three options:

  • Tell clients to just make a parallel structure.
  • Have this sort of intrusive storage and just accept that clients that want to use it will have some awkward casting to do.
  • Get rid of the initial array dependence by making it a callback API instead of an array-building API, and then clients that want to build an array can build an array with the type they actually want.

I guess the third might be awkward?

rsmith added inline comments.Jan 27 2021, 12:33 PM
clang/include/clang/AST/CXXInheritance.h
77

Yes, the problem with the third option is its interaction with the post-processing step in CXXRecordDecl::lookupInBases. We're not supposed to consider virtual bases if they're shadowed along any path. The way that lookupInBases handles this is by first finding all paths to subobjects, and then removing any path containing an edge to a virtual base for which some other path ends in a class deriving from that vbase. So we end up with the callback being called spuriously in some cases, and lookupInBases ends up producing only a subset of the paths that were followed.

I'm pretty sure we could rearchitect the lookup process so that this doesn't happen -- so that we never visit a virtual base until we've finished traversing all subobjects that virtually inherit from it -- but that's not what the lookupInBases algorithm is doing right now.

rjmccall added inline comments.Jan 27 2021, 12:46 PM
clang/include/clang/AST/CXXInheritance.h
77

Okay. In the end, architecturally it's a little suspect that this is here, but I definitely don't think we need to hold up this patch to eliminate it; we can just flag that this is for arbitrary client use.