This is an archive of the discontinued LLVM Phabricator instance.

[clang][RecoveryExpr] Clarify the dependence-bits documentation.
ClosedPublic

Authored by hokein on Jul 6 2020, 4:50 AM.

Details

Summary

The expr dependent-bits described that the expression somehow depends on a template paramter.

With RecoveryExpr, we have generalized it to "the expression depends on a template parameter, or an error". This patch updates/cleanups all related comments of dependence-bits.

Diff Detail

Event Timeline

hokein created this revision.Jul 6 2020, 4:50 AM
Herald added a project: Restricted Project. · View Herald TranscriptJul 6 2020, 4:50 AM

What's the rationale/motivation for this? I'd expect this to break a lot of subtle assumptions, because the invariant "nothing can be dependent without being instantiation-dependent" is a useful one.

It also doesn't seem correct from first principles. My mental model is:

  • dependence describes code where some properties are unknown because a template wasn't instantiated yet
  • we've generalized this to "some properties are unknown because we don't know what this piece of code is supposed to say"
  • we can never know the value at compile-time even in contexts where we should, so RecoveryExpr is value-dependent
  • we only know the type at compile-time if the RecoveryExpr's type is known and not itself dependent, so RecoveryExpr is usually value-dependent
  • the code may intend to refer to entities valid only in certain instantiations (consider (void)f(T()) where f is an overloaded function). So RecoveryExpr is at least sometimes instantiation-dependent

At best I think we could say that the expr is only instantiation-dependent if it occurs in a template scope - otherwise there's no template we might fail to instantiate.

hokein added a comment.Jul 9 2020, 4:08 AM

I think this depends on how we interpret the instantiation-dependent bit. In clang, it currently has two semantics:

  1. an expression (somehow) depends on a template parameter;
  2. an expression is dependent;

Prior to RecoveryExpression being added, 1 and 2 are equal, so we use isInstantiationDependent to check whether the code is dependent. This is fine.

However since we now generalize the "dependent" concept to mean "depends on a template parameter, or an error", checking isInstantiationDependent() is incomplete to determine the code is dependent.

what we could do?

Option 1 (what this patch does):

instantiation-dependence still implies the expression involves a template parameter -- if a recovery expression doesn't involves a template parameter, we don't set this flag

pros:

  • avoid unnecessary instantiations if an AST node doesn't depend on template parameters but contain errorss;
  • more precise semantics (maybe for better error-tracking, diagnostics), we can express a recovery-expr involves a template parameter (contains-errors + instantiation-dep), or not (contains-erros + no instantiation-dep)

cons:

  • as you mentioned above, we violate the "nothing can be dependent if it is not instantiation dependent" assumption, which may leads us to fix a long tail of bugs (high effort), and the fix is basically adding containsErrors to where isInstantiaionDependent is used.

This makes me feel like we're going back to "use containsErrors everywhere" solution. Not sure what pros would buy us, maybe it is not worth the effort.

Option 2:

Keep it as it-is now, always set it, then the instantiation-dependent would mean "an expression depends on a template parameter or an error".

pros:

  • everything would work without changing anything!

cons:

  • keep using the instantiation name is confusing; we could do clarify the document of isInstantiation

While writing this, I'm starting to like the option2, any other thoughts?

I think this depends on how we interpret the instantiation-dependent bit. In clang, it currently has two semantics:

  1. an expression (somehow) depends on a template parameter;
  2. an expression is dependent;

Not trying to be difficult, but I'm not sure what you mean by "dependent" in #2. Informally, it means the same thing as #1. Formally, C++ defines type-dependence and value-dependence for expressions, and the ABI defines instantiation-dependence for expressions - AFAIK "dependent" doesn't have a formal meaning.

So I'd rather say it has two definitions:

  • informally, it (somehow) depends on a template parameter.
  • formally (from the C++ABI), an expression is instantiation-dependent if it is type-dependent or value-dependent, or it has a subexpression that is type-dependent or value-dependent.

To preserve the linkage between these two, IMO we need to extend the formal definition to errors if and only if we extend the informal definition to errors. Accidentally unlinking the two is almost certain to result in subtle bugs as either the implementation subtly differs from the spec, or it subtly differs from the mental model.
That is, we need to decide whether to make all of these changes:

  • (informal) instantiation-dependent means it (somehow) depends on a template parameter or error
  • (formal) value-dependent means that (informally) an expressions's value depends on a tparam or err
  • (formal) type-dependent means that (informally) an expression's type depends on a tparam or err

The approach so far has been YES. This looks like:

  • all RecoveryExprs are value-dependent, instantiation-dependent, and contains-errors
  • RecoveryExprs are type-dependent if we didn't guess the type, or we guessed it was a dependent type

Here we make use of existing codepaths that turn off checking in the presence of dependent code. We mostly need to check contains-errors to prevent a bad-recovery cascade.

The alternate approach is NO. This looks like:

  • all RecoveryExpr are contains-errors
  • a RecoveryExpr are instantiation-dependent if a subexpression is (i.e. refers to a template param)
  • a RecoveryExpr is value-dependent if one of its direct subexpressions is, or if the broken code is likely to yield different values in different template instantiations due to the context
  • a RecoveryExpr is type-dependent if its type is known to be dependent, or if the broken code is likely to yield different types in different template instantiations due to the context

Here codepaths that turn off checking need to turn it off for errors explicitly.
There's subtlety in the value-depnedent and type-dependent bits, which probably needs to be resolved on a case-by-case basis.
And there's more machinery needed: when an expression's type is not known but isn't believed to be dependent, what is it? I think we end up needing to define RecoveryType here (the fundamental contains-errors type).

So this is doable and maybe even a good idea (not obvious to me what the practical differences are). But it's a large change, not a small one like this patch.

Option 1 (what this patch does):

instantiation-dependence still implies the expression involves a template parameter -- if a recovery expression doesn't involves a template parameter, we don't set this flag

It doesn't make sense IMO to make this change without also saying "value-dependence implies the value dependds on a template parameter". It's too surprising and hard to reason about, and I don't see a principled reason for this distinction.

Not sure what pros would buy us

That's the thing that surprises me, it's a richer model, it's principled and correct... but I have no intuition what we can use it for.

Option 2:
cons:

  • keep using the instantiation name is confusing; we could do clarify the document of isInstantiation

Yeah. I don't think actually renaming it would be appropriate, but we should be really explicit about its non-template meaning, and also give a generalized intuitive definition that makes sense. ("Depends in any way on a template parameter or error. In a template context, this implies that the resolution of this expr may cause instantiation to fail")

While writing this, I'm starting to like the option2, any other thoughts?

Yeah, I think we'd have to have some concrete benefits to pursue option 1. Doing it right is a lot of work.

Not trying to be difficult, but I'm not sure what you mean by "dependent" in #2. Informally, it means the same thing as #1. Formally, C++ defines type-dependence and value-dependence for expressions, and the ABI defines instantiation-dependence for expressions - AFAIK "dependent" doesn't have a formal meaning.

So I'd rather say it has two definitions:

informally, it (somehow) depends on a template parameter.
formally (from the C++ABI), an expression is instantiation-dependent if it is type-dependent or value-dependent, or it has a subexpression that is type-dependent or value-dependent.
To preserve the linkage between these two, IMO we need to extend the formal definition to errors if and only if we extend the informal definition to errors. Accidentally unlinking the two is almost certain to result in subtle bugs as either the implementation subtly differs from the spec, or it subtly differs from the mental model.

agree! I think the mental model you described makes more sense. yeah, it is *important* to keep the linkage between the informal and formal definition. It also explains why we should set instantiation-bit. Thanks!

ok, we have an agreement on the current approach. The only remaining thing is to clarify comments in clang. I will do it here (rather than D83215), so that this discussion thread can be found via the commit message.

hokein updated this revision to Diff 276959.Jul 10 2020, 2:25 AM

Clarify the documentation for dependence-bits, and extend it to error.

hokein updated this revision to Diff 276961.Jul 10 2020, 2:28 AM

more tweaks.

sammccall accepted this revision.Jul 10 2020, 4:53 AM
sammccall added inline comments.
clang/include/clang/AST/DependenceFlags.h
22

nit: I'd weaken usually->often

25

Maybe add ", and extend it to errors"

56

evenif > even if

63

nit: i'd just say "e.g. decltype(expr-with-errors)"

111–112

I'd rephrase as "Depends on a template parameter or error in some way. Validity depends on how the template is instantiated or the error is resolved."

clang/include/clang/AST/Expr.h
162

maybe "or an error, whose resolution is unknown"

This hints at the connection between template and error dependency, and also is more accurate for type-dependence (where sometimes we're not type dependent because the type depends on an error but we've guessed at what the resolution is)

6233

I'm not sure why this "in addition" is part of the same paragraph, it seems unrelated.

I'd move to a separate paragraph and drop "in addition".

6236

maybe "In this case, the expression is not type-dependent (unless the known type is itself dependent)"

clang/lib/AST/ComputeDependence.cpp
499

nit: I'd say "always value-dependent, and therefore instantiation dependent"
and make "contains-errors" a separate bullet at the end like "- contains errors (ExprDependence::Error), by definition"

500

opaque

502

hmm, I'd missed the type-dependent subexpressions question.
If there are type-dependent subexpressions, but a non-dependent type was specified for the RecoveryExpr, is the expr type-dependent?

This is the part that I think we have discretion over.
The definition of type-dependent does say "any type-dependent subexpression" but then lays out a list of exceptions such as casts, which are not type-dependent even if their argument is. What these have in common is that the type is known.

So I think this comes down to whether it's the caller's job to work this out, or we want to conservatively call these expressions dependent.

I think the former is probably better - marking the expression as type-dependent but not having its actual type be dependent doesn't serve any purpose I'm aware of. It's also inconsistent with the informal definition of type-dependence described earlier in this patch.

So the comment should describe the current state, but maybe a FIXME to remove the type-dependent subexpressions clause?

This revision is now accepted and ready to land.Jul 10 2020, 4:53 AM
hokein updated this revision to Diff 277332.Jul 12 2020, 11:39 PM
hokein marked 12 inline comments as done.

address comments.

hokein retitled this revision from [AST][RecoveryExpr] Don't set the instantiation-bit. to [clang][RecoveryExpr] Clarify the dependence-bits documentation..Jul 12 2020, 11:41 PM
hokein edited the summary of this revision. (Show Details)
hokein edited the summary of this revision. (Show Details)
hokein added inline comments.Jul 12 2020, 11:44 PM
clang/lib/AST/ComputeDependence.cpp
502

yeah, I think we should respect to actual type of RecoveryExpr, added a fixme.

This revision was automatically updated to reflect the committed changes.