This is an archive of the discontinued LLVM Phabricator instance.

[pseudo][wip] Enforce the C++ rule for the optional init-declarator-list.
Needs ReviewPublic

Authored by hokein on Aug 12 2022, 5:00 AM.

Details

Reviewers
sammccall
Summary

Basically it implements the https://eel.is/c++draft/dcl.pre#5 rule.

The motivation is to eliminate a false parse (simple declaration) for the enum
declaration.

// A simple declaration without a declarator
// or an opaque-enum-declaration
enum a;

However, it has some negative effects on broken code (with the single-type
decl-specifier-seq guard):

  • break the const int; case (which we used to parse as a declaration without declarator), now we fails to parse, because it is invalid code per the C++ spec.
  • regress the const Foo; case, now we parse it as a declaration with a declarator named Foo and const as the decl-specifier-seq, vs we used to parsed as a declaration without declarator (which is better IMO);

Just post here to get thoughts about it:
My feeling is that this degrades the parser a lot for the little benefit we gain,
we probably don't move forward with this approach.

Diff Detail

Event Timeline

hokein created this revision.Aug 12 2022, 5:00 AM
Herald added a project: Restricted Project. · View Herald TranscriptAug 12 2022, 5:00 AM
hokein requested review of this revision.Aug 12 2022, 5:00 AM
Herald added a project: Restricted Project. · View Herald TranscriptAug 12 2022, 5:00 AM

Basically it implements the https://eel.is/c++draft/dcl.pre#5 rule.

Just to make sure we're on the same page, my understanding of that rule is roughly "declarations must declare something", and it has two parts:

  • the declarators can only be omitted if a named class/enum is declared (this patch)
  • "shall (re)introduce one or more names": if the declarators are omitted, either the class/enum must be named or the enum must provide enumerators (not in this patch)

This corresponds to clang's -Wmissing-declarations, which is a warning rather than an error! I think this is an argument to allow these parses at least in cases when there's no correct parse.
Ideally we'd eliminate the wrong one when there's actual ambiguity, this isn't really doable with guards, the closest mechanism we have is "soft" disambiguation.

regress the const Foo; case, now we parse it as a declaration with a declarator named Foo and const as the decl-specifier-seq, vs we used to parsed as a declaration without declarator (which is better IMO);

I don't think it's clear. Foo could easily be a easily type with missing variable name or a variable name with missing type. A human would interpret this based on the other occurrences of Foo, based on case, etc. I think this is an argument in favor of solving this "softly" with ranking (though not a strong one: we can't let 'correctly' parsing broken code drive the parser *too* much).

My feeling is that this degrades the parser a lot for the little benefit we gain, we probably don't move forward with this approach.

I agree, I think we should solve this by downranking e.g. declarator-less declarations (so they're only used if needed). It's not ideal (I'd like to completely drop the misparse of enum foo; and similar cases where there's a valid alternative), but this isn't an *incredibly* common pattern that demands complete early removal from the forest.