Page MenuHomePhabricator

[clang] template / auto deduction deduces common sugar
Needs ReviewPublic

Authored by mizvekov on Oct 6 2021, 6:09 PM.

Details

Reviewers
rsmith
v.g.vassilev
Quuxplusone
aaronpuchert
sammccall
aaron.ballman
Group Reviewers
Restricted Project
Summary

This patch implements deducing the common sugar between two types,
and uses it to establish the result when checking deduced template arguments.

This carries over to auto return type deduction as well, which is reworked
to update the deduced type.

The common sugar is calculated by stripping off top-level sugar which differs
between types, crossing over and rebuilding non-sugar type nodes if needed.

Signed-off-by: Matheus Izvekov <mizvekov@gmail.com>

Diff Detail

Event Timeline

mizvekov created this revision.Oct 6 2021, 6:09 PM
mizvekov published this revision for review.Oct 6 2021, 6:51 PM
Herald added a project: Restricted Project. · View Herald TranscriptOct 6 2021, 6:51 PM
Herald added a subscriber: cfe-commits. · View Herald Transcript
mizvekov updated this revision to Diff 378054.Oct 7 2021, 5:57 PM
  • Handle almost all type classes in getCommonSugar.
mizvekov updated this revision to Diff 378313.Oct 8 2021, 11:06 AM
mizvekov edited the summary of this revision. (Show Details)
  • Handle all type classes.
mizvekov updated this revision to Diff 378388.Oct 8 2021, 5:47 PM
mizvekov retitled this revision from [clang] WIP: template / auto deduction deduces common sugar to [clang] template / auto deduction deduces common sugar.
mizvekov edited the summary of this revision. (Show Details)
  • Add missing tests, not marked as WIP anymore.
mizvekov updated this revision to Diff 378406.Oct 8 2021, 8:20 PM
  • Tests and fix for dynamic exception spec corner case.
mizvekov updated this revision to Diff 378600.Oct 11 2021, 3:10 AM
  • Handle different top level qualifiers in function parameters.
  • More test cases.

Do you have examples showing what this does in practice for errors in real-world code? I'm concerned that stripping back to the common type sugar may produce an unintuitive result in some cases, although it certainly does seem at least theoretically nice to use a commutative and associative function to combine deductions like this. The kind of thing I'd be worried about would be (eg) a container where the iterator and const_iterator are the same, but where the common sugar of T::iterator and T::const_iterator are some internal type that the user has never heard of. In general it seems like this sugar stripping could result in types that are distant from the user's code. As an extreme example, in the case where one of the types is canonical and the other is not, it seems like stripping all the way back to the canonical type would be worse than ignoring the canonical version of the type and keeping the sugared version.

Perhaps we could treat canonical types as a special case (effectively assuming they came from taking a type and canonicalizing it at some point, rather than from a type the user wrote) and when combining two types, use the non-canonical type if one of them is canonical, and otherwise use the common sugar. That should still be commutative and associative and generally well-behaved, but won't degrade badly when given a canonical type as input. I think that still leaves the possibility of a surprising outcome when combining types that both desugar to some internal type, though I'm not sure what we can do about that other than make an arbitrary choice of which type sugar we like more.

I've not yet done any detailed review of the common type sugar computation mechanism.

clang/lib/Sema/SemaExprCXX.cpp
2008

Previously we distinguished between "auto deduction failed and already issued a diagnostic" and "auto deduction failed but no diagnostic has been issued". This allowed us to be sure we always issue exactly one diagnostic for each deduction failure. We seem to have lost that distinction -- TDK_MiscellaneousDeductionFailure does not indicate that a diagnostic has already been issued, and if an auto deduction fails for this reason, we now might produce no diagnostics at all.

clang/test/SemaCXX/sugared-auto.cpp
146

Why don't we get Virus as the deduced return type from line 143 here?

rsmith added inline comments.Oct 11 2021, 6:29 PM
clang/test/SemaCXX/sugared-auto.cpp
146

Oh, never mind, we've not updated the conditional expression handling to use getCommonSugar yet. We probably should -- it currently has a very minimal form of the same thing; see Sema::FindCompositePointerType. That can presumably be changed to use getCommonSugar once it strips down to a common type. On around SemaExprCXX.cpp:6870:

-  QualType Composite = Composite1;
+  QualType Composite = Context.getCommonSugar(Composite1, Composite2);

Do you have examples showing what this does in practice for errors in real-world code? I'm concerned that stripping back to the common type sugar may produce an unintuitive result in some cases, although it certainly does seem at least theoretically nice to use a commutative and associative function to combine deductions like this. The kind of thing I'd be worried about would be (eg) a container where the iterator and const_iterator are the same, but where the common sugar of T::iterator and T::const_iterator are some internal type that the user has never heard of. In general it seems like this sugar stripping could result in types that are distant from the user's code. As an extreme example, in the case where one of the types is canonical and the other is not, it seems like stripping all the way back to the canonical type would be worse than ignoring the canonical version of the type and keeping the sugared version.

I don't have handy access to very large code bases where this could be deployed experimentally, but this is something I can try to look into.

Perhaps we could treat canonical types as a special case (effectively assuming they came from taking a type and canonicalizing it at some point, rather than from a type the user wrote) and when combining two types, use the non-canonical type if one of them is canonical, and otherwise use the common sugar. That should still be commutative and associative and generally well-behaved, but won't degrade badly when given a canonical type as input. I think that still leaves the possibility of a surprising outcome when combining types that both desugar to some internal type, though I'm not sure what we can do about that other than make an arbitrary choice of which type sugar we like more.

The thing I would be concerned with this special case is that it would also give surprising results, ie it could deduce that Socrates is a Dog.
I guess there are two use cases scenarios here, one where we have this transparent hierarchy, and this algorithm gives results that make intuitive sense,
and the other where we have some typedef which we want to be opaque in order to not expose internal details.

So exploring your example, suppose we try to deduce from an iterator and a const_iterator.
We have some options here:

  • We deduce as either iterator or const_iterator. If there is an IS-A relationship between them, and we pick the right choice, then we pack up and go home, job well done. If there is no such relationship, neither answer seems right.
  • We deduce the canonical type, which might be something like struct vendor::detail::Foo *. This exposes internal details, but at least it has some vocabulary information, so you know this is a pointer to an object of some internal class. It's not good from a user friendliness PoV, but it's good from a 'I want to debug this thing' perspective.
  • We deduce to some type sugar which is meant to be internal, like vendor::detail::iterator_t. This is not very good, maybe it's worse from a user friendliness PoV than the bullet point above as we expose even more internal details. But maybe in some cases it's better as the vendor can also pick a name for the typedef which is more explanatory than the the canonical type, which will still be available in the 'aka', so from the debuggability perspective this also seems better.
  • We create some new type sugar which wraps the information that some type was deduced from two other types. This might be too much information overload, and we still need to have an answer to the 'single step desugar' of it, so I am not exploring this much further for now ;)

The problem of hiding internal details in error messages is a general problem, that maybe this solution would make a bit worse in some cases,
but that also means that solutions to the general problem can also remedy the problems we cause here.
One such idea (not proposing formally, just giving an example) would be an attribute on typedefs which hides the underlying sugar,
making it AS-IF the typedef was written on the canonical type. But it would still not hide the canonical type which is also
implementation detail, so not a huge win.

Going back to the 'treat canonical types as not written' workaround, I think there are too many cases where we are doing the wrong thing here in clang.
Basically any type written bare without any name / keyword qualifiers will not be treated by some ElaboratedType like mechanism. I suppose that as we fix
those problems, the need for this workaround will sort of disappear. I am not too opposed to it, but I think it might be better to give less, but more correct information, than
to some times make wild guesses ;-P

I've not yet done any detailed review of the common type sugar computation mechanism.

No worries, and thanks a ton for the feedback!!!

clang/test/SemaCXX/sugared-auto.cpp
146

Yeah, if you look into the next patch in the stack, which is still WIP, there is a change that fixes this aspect of this test case, but it's a different change than what you are proposing here. I will take a look again, but are you proposing that I add this sort of changes to this same patch, or keep things separate as I am trying to do?
Either answer is fine, less patches means less rebuild time for me :-)

mizvekov updated this revision to Diff 379179.EditedOct 12 2021, 2:05 PM
  • Introduce TDK_AlreadyDiagnosed.

Addresses rsmith's review comments.

mizvekov marked 3 inline comments as done.Oct 12 2021, 4:02 PM
mizvekov added inline comments.
clang/test/SemaCXX/sugared-auto.cpp
146

Do you have examples showing what this does in practice for errors in real-world code? I'm concerned that stripping back to the common type sugar may produce an unintuitive result in some cases, although it certainly does seem at least theoretically nice to use a commutative and associative function to combine deductions like this. The kind of thing I'd be worried about would be (eg) a container where the iterator and const_iterator are the same, but where the common sugar of T::iterator and T::const_iterator are some internal type that the user has never heard of. In general it seems like this sugar stripping could result in types that are distant from the user's code. As an extreme example, in the case where one of the types is canonical and the other is not, it seems like stripping all the way back to the canonical type would be worse than ignoring the canonical version of the type and keeping the sugared version.

FWIW, I believe in those cases the best tool might be some sort of attribute to treat an alias like a "strong typedef".

mizvekov edited the summary of this revision. (Show Details)Thu, Nov 11, 5:06 PM
mizvekov updated this revision to Diff 387404.Mon, Nov 15, 2:18 PM
  • Add libcxx/DELETE.ME to trigger libcxx CI.
Herald added a project: Restricted Project. · View Herald TranscriptMon, Nov 15, 2:18 PM
Herald added a reviewer: Restricted Project. · View Herald Transcript
mizvekov updated this revision to Diff 387554.Tue, Nov 16, 3:41 AM
  • Fix rebuilding Template Specializations