This is an archive of the discontinued LLVM Phabricator instance.

[AST] Preserve the DeclRefExpr when it refers to an invalid decl.
Needs ReviewPublic

Authored by hokein on Mar 26 2020, 2:41 AM.

Details

Reviewers
adamcz
sammccall
Summary

Previously, clang refused to build the AST nodes for them.

Diff Detail

Event Timeline

hokein created this revision.Mar 26 2020, 2:41 AM
Herald added a project: Restricted Project. · View Herald TranscriptMar 26 2020, 2:41 AM

I like the idea, but it seems to regress quite a lot of diagnostics... e.g. where we fail to deduce auto and then say it's not a pointer.

Also this is regressing things in the -fno-recovery-ast case, because of the changes to CheckDeclInExpr with the existing callsites that allow invalid decls. (I'm not sure what passing AcceptInvalid actually *did* before)

hokein added a comment.Apr 3 2020, 5:57 AM

I like the idea, but it seems to regress quite a lot of diagnostics... e.g. where we fail to deduce auto and then say it's not a pointer.

Also this is regressing things in the -fno-recovery-ast case, because of the changes to CheckDeclInExpr with the existing callsites that allow invalid decls. (I'm not sure what passing AcceptInvalid actually *did* before)

Yeah, the solution makes diagnostics worse...

The motivation of the patch is to preserve the DeclRefExpr nodes even the referenced var-decl is invalid, so that clangd features can still work, e.g. the case we care about

struct A { A(int); }
void t() {
  A a; // VarDecl a is invalid.
  a.^ // none of clangd features would work as we don't have any AST node
}

we have a few options:

  1. build DeclRefExpr regardless the validity of VarDecl;
  2. mark VarDecl valid even when there are errors during initialization;
  3. using RecoveryExpr, e.g. build a RecoveryExpr that wraps a DeclRefExpr if the VarDecl is invalid;

as shown in this patch, 1 is not ideal and hurts about clang diagnostics. 2 looks promising, and safer (probably not regress clang diagnostic a lot).

I think VarDecls in following cases should be valid. Interestingly, clang doesn't seem to have consistent behavior -- some of them are valid already, so 2 has the greatest chance of not introducing large regressions.

struct A {
  A(int, int);
};

void test() {
  // 1) default initialization (without any initializers)
  A a1; // invalid

  // 2) direct initialization
  A a2(1); // invalid
  A a3{1}; // invalid
  A a4(invalid()); // valid
  A a5{invalid()}; // valid

  int var;
  // 3) copy initializaiton
  A a6 = 1; // invalid
  A a7 = A(1); // valid
  A a8 = A(var); // valid
  A a10 = A{1}; // valid
  A a9 = {1}; // invalid
}
nridge added a subscriber: nridge.Apr 4 2020, 4:55 PM

Yeah, so I guess the point of marking decls invalid is in large part to stop recovery using it.
It's not surprising 1 has recovery regressions. 3 should have fewer (basically because you'll get the dependent bits too), but likely still some.
2 definitely seems the way to go given that this is currently inconsistent between different invalid ways of initializing a variable. Thanks for D77395!

rsmith added a subscriber: rsmith.Apr 9 2020, 2:56 PM

Generally I think we should be moving towards finer-grained "invalid" / "contains errors" bits, so that we can preserve as much of the AST as possible and produce accurate diagnostics for independent errors wherever possible. To that end, I think generally the "invalid" bit on a declaration should concern *only* the "declaration" part of that declaration (how it's seen from other contexts that don't "look too closely"). So in particular:

  • The initializer of a variable should play no part in its "invalid" bit. If the initializer expression is marked as "contains errors", then things that care about the initializer (in particular, constant evaluation and any diagnostics that look into the initializer) may need to bail out, but we should be able to form a DeclRefExpr referring to that variable -- even if (say) the variable is constexpr. Exception: if the variable has a deduced type and the type can't be deduced due to an invalid initializer, then the declaration should be marked invalid.
  • The body of a function should play no part in its "invalid" bit. (This came up in a different code review recently.)

We shouldn't build a DeclRefExpr to a declaration whose "invalid" bit is set, because (for example) we don't necessarily know what type that expression has. Building a RecoveryExpr in that case seems reasonable.

@hokein I think this patch is obsolete, right?

Richard's last comment is basically what's currently implemented: if lookup doesn't yield any valid results, we create a recoveryexpr for the failed lookup. It doesn't have a pointer to the invalid decl (which would be neat) but it doesn't seem like we're planning to fix that.

sammccall resigned from this revision.Feb 12 2021, 7:42 AM