This is an archive of the discontinued LLVM Phabricator instance.

[AST] Fix the source range for TagDecl if there is no matched } brace.
Needs ReviewPublic

Authored by hokein on May 25 2020, 1:10 AM.

Details

Reviewers
sammccall
akyrtzi
Summary

The AST is preserved when the TagDecl misses a } brace, but the source
range seems incorrect (just covers the name, not the body, because
RBraceLoc is invalid and we fallback to NameLoc). This breaks the
SelectionTree in clangd -- clangd skips the TagDecl entirely.

Diff Detail

Event Timeline

hokein created this revision.May 25 2020, 1:10 AM
Herald added a project: Restricted Project. · View Herald TranscriptMay 25 2020, 1:10 AM

Argh I never submitted these comments, sorry.

clang/lib/AST/Decl.cpp
4129

I'm confused about this change.

We were using BR.end if it was valid, now we're using it if BR.start is valid. So this changes behavior in two cases:

  • BR.start is valid, BR.end is invalid: old was {getOuterLocStart(), getLocation()}. new is {getOuterLocStart(), invalid}
  • BR.end is valid, BR.start is invalid: old was {getOuterLocStart(), BraceRange.getEnd()}. new is {getOuterLocStart(), getLocation()}

These both seem worse to me, what am I missing?

Patch description refers to the first case, I wonder if this is just a bug in clangd cancelling out a bad AST (does it just mark tokens until EOF if the end location is invalid?)

clang/test/AST/ast-dump-invalid-brace.cpp
6

or // missing: }

hokein marked an inline comment as done.May 28 2020, 8:33 AM
hokein added inline comments.
clang/lib/AST/Decl.cpp
4129

We were using BR.end if it was valid, now we're using it if BR.start is valid. So this changes behavior in two cases:

BR.start is valid, BR.end is invalid: old was {getOuterLocStart(), getLocation()}. new is {getOuterLocStart(), invalid}
BR.end is valid, BR.start is invalid: old was {getOuterLocStart(), BraceRange.getEnd()}. new is {getOuterLocStart(), getLocation()}

case 1), I think this is intended, though it looks like worse at first glance :( Invalid source location has subtle semantic in clang -- "BR.end() is invalid" implies that there is no } in the source code.

And Missing-} namespaceDecl in today's clang is using invalid loc (not the NameLoc) as the end loc:

// `-NamespaceDecl 0x8d27728 </tmp/t.cpp:2:1, <invalid sloc>> col:11 abc
namespace abc {
// missing }

case 2) is less interesting, I wonder whether we will encounter this case in practice, tried and failed to come up with an example to it happen. I think we can preserve the old behavior in this patch.

I wonder if this is just a bug in clangd cancelling out a bad AST (does it just mark tokens until EOF if the end location is invalid?)

I assume "clangd" you mean "clang"? I think clang's behavior is fine, the AST is recovered well, it is an issue that the TagDecl::getSourceRange doesn't correctly claim the range of TagDecl.

e.g.

class Test {
  int field;
// missing }
// `-CXXRecordDecl 0x96dd738 </tmp/t.cpp:1:1, col:7> col:7 class Test definition
//   |-CXXRecordDecl 0x96dd878 <col:1, col:7> col:7 implicit class Test
//   `-FieldDecl 0x96dd940 <line:2:3, col:7> col:7 field 'int'

And we may not set the BraceRange.end to the EOF position -- as it changes the semantic, making it valid implies we have a } in the source code.

sammccall added inline comments.May 28 2020, 9:03 AM
clang/lib/AST/Decl.cpp
4129

No, I meant clangd. Invalid source ranges hit codepaths in Selection.cpp that isn't deliberately handling them, and we're getting "random" results that may happen to do what you want in some cases, but...

e.g. mayHit returns true if the range isn't valid, which is probably good, but the reason is bizarre: it thinks the range is from a macro expansion.

However claimRange seems to claim no tokens. So I'd guess hover, find references etc on ABC wouldn't work after this patch, so there's regressions there too.

Still even if it breaks clangd, the most important question here is how *should* the source range be represented. Is a half-broken range actually useful? Maybe it's namespacedecl that should be "fixed".

hokein marked an inline comment as done.Jun 2 2020, 2:17 AM
hokein added inline comments.
clang/lib/AST/Decl.cpp
4129

ok, that's fair enough, so the current situation is

  • braceRange has its semantics, we should keep the EndLoc as Invalid;
  • getSourceRange() returns a wrong range ({getOuterLocStart(), getLocation()}, {getOuterLocStart(), invalid} are not ideal), which doesn't reflect the AST;

Idea: we can add a new member EndOfLoc in TagDecl, and getSourceRange returns {getOuterLocStart(), EndOfLoc}, but this will increase the TagDecl size by 8 bytes, WDYT?