This is an archive of the discontinued LLVM Phabricator instance.

Improve error recovery around colon.
ClosedPublic

Authored by sepavloff on May 7 2014, 10:34 AM.

Details

Summary

Recognize additional cases, when '::' is mistyped as ':' and provide
descriptive diagnostics for them.
This is a fix to RP18587 - colons have too much protection in member-declarations

Diff Detail

Repository
rL LLVM

Event Timeline

sepavloff updated this revision to Diff 9179.May 7 2014, 10:34 AM
sepavloff retitled this revision from to Improve error recovery around colon..
sepavloff updated this object.
sepavloff edited the test plan for this revision. (Show Details)
sepavloff added a subscriber: Unknown Object (MLST).
sepavloff updated this revision to Diff 9767.May 23 2014, 10:28 AM

Added tests for unnamed fields.

sepavloff updated this revision to Diff 9923.May 29 2014, 10:07 AM

Changed check as Richard recommended.

Sorry, overlooked it.

sepavloff updated this revision to Diff 10083.Jun 4 2014, 12:43 AM

Fixed parsing unnamed bit fields

Thank you for example, it represents a case missed from the patch.
Adding colon protector indeed helps for the code:

const int m = 4;
struct A {
  struct n {
    int m;
  };
  int A::n : m;
};

I found another example, that is not MS-specific and is currently
misparsed by clang:

struct A {
  enum class B {
    C
  };
};
const int C = 4;
struct D {
  A::B : C;
};

Fixing this case is a bit trickier because declspec is parsed with colon
protection turned off, otherwise we cannot recover typos like:

A:B c;

To recogneze that A::B:C is not a typo for A::B::C we must know that the
parsed scope spec is a part of type specifier.

In the updated patch TryAnnotateCXXScopeToken has additional parameter,
that provide information whether a type is expected, with this change the
above example is parsed correctly.

rsmith edited edge metadata.Jun 5 2014, 5:13 PM

Fixing this case is a bit trickier because declspec is parsed with colon
protection turned off, otherwise we cannot recover typos like:

A:B c;

With your patch, it's parsed with colon protection turned on, right? Or do we also need to add colon protection to tentative parsing of member-declarations?

If you want to detect and recover from the above case, you need more lookahead than you're currently using: you need to check if the colon is followed by *two* identifiers. If it is, you can temporarily turn off colon protection.

lib/Parse/Parser.cpp
1692–1694 ↗(On Diff #10083)

This doesn't seem correct. Even if A::B::C is a type, we still should not do the recovery here, because this may still be an unnamed bitfield.

Also, I'm concerned about whether this NeedType thing is correct, since we use TryAnnotateCXXScopeSpec from tentative parsing when we don't know whether we expect a type or not.

In D3653#10, @rsmith wrote:

Fixing this case is a bit trickier because declspec is parsed with colon
protection turned off, otherwise we cannot recover typos like:

A:B c;

With your patch, it's parsed with colon protection turned on, right? Or do we also need to add colon protection to tentative parsing of member-declarations?

With this patch declspec is parsed with colon protection turned off, this allows make recover of errors like:

struct S2b {
  C1:C2 f(C1::C2);
};

or

struct S6b {
  C1:C2 m1 : B1::B2; 
};

Colon protection is turned on when parsing declarator, - when parser expects nested-name-specifier of pointer to a member and when parsing direct-declarator.

If you want to detect and recover from the above case, you need more lookahead than you're currently using: you need to check if the colon is followed by *two* identifiers. If it is, you can temporarily turn off colon protection.

The patch uses a bit different way. If parser parses declspec and sees

A:B

it may be one of two things:

  • It is unnamed bit field. In this case A must be a type suitable for bit field and B must be a value.
  • It is typo for A::B. In this case A::B must be a type.

In both cases parser must read type first. As there is no case when both A and A::B are types suitable for declaring bit fields, these cases can be distinguished unambiguously, just by trying to build longest nested name specifier, which is still used to refer to a type. If A::B is a type or namespace, A cannot be used as a type of a bit field and recovery A:B -> A::B is valid. Without scoped enums it would be enough to check if A::B exists, that is why we need to pass NeedType when analyzing C++ scope.

lib/Parse/Parser.cpp
1692–1694 ↗(On Diff #10083)

Whether the recovery is attempted is determined by ColonIsSacred field. NeedType only assists the recovering when ColonIsSacred is false. By default NeedType is false and behavior is identical to the current one. In the patch it is set only when parsing declspec, to correctly handle case when A is scoped enum and A::B is valid.

sepavloff updated this revision to Diff 10967.Jun 29 2014, 11:47 AM
sepavloff edited edge metadata.

Parse declspec with colon protection turned on.

It reduces number of cases where colon recovery is possible, but looks more reliable.

rsmith accepted this revision.Jul 8 2014, 10:49 AM
rsmith edited edge metadata.

Looks great, thanks!

lib/Parse/ParseDecl.cpp
4531–4534 ↗(On Diff #10967)

Please reindent this to make the precedence more obvious (or clang-format it). [Last 3 lines should be one space further to the right.]

This revision is now accepted and ready to land.Jul 8 2014, 10:49 AM
sepavloff closed this revision.Jul 14 2014, 9:50 AM
sepavloff updated this revision to Diff 11391.

Closed by commit rL212957 (authored by @sepavloff).