Page MenuHomePhabricator

Introduce -Wreserved-identifier

Authored by serge-sans-paille on Dec 11 2020, 2:03 AM.



Warn when a declaration uses an identifier that doesn't obey the reserved
identifier rule from C and/or C++.

Do not perform the check at preprocessor level because:

  1. it will be quite costly
  2. we don't have accurate linkage info at that point

Diff Detail

Event Timeline

There are a very large number of changes, so older changes are hidden. Show Older Changes
rsmith added inline comments.Feb 3 2021, 3:42 PM

It'd be useful to test that we don't diagnose

void foo(unsigned int _not_reserved) { ... }

You don't seem to have a test for TUK_Reference:

struct _Zebulon3 *p;

(Nor for TUK_Friend.)


_preserved is not an identifier with file scope, so I think we shouldn't warn here. Perhaps the problem is that we're doing this check before the struct gets reparented into the function declaration (which only happens after we finish parsing all the parameters and build the function declaration).

We could perhaps check the Scope in addition to checking the DeclContext to detect such cases.


You mean, specifically for _strdup? In general, this looks exactly the like the kind of declaration we'd want to warn on. If we don't want a warning here, we could perhaps recognize _strdup as a builtin lib function. (I think they shouldn't warn, because we'll inject a builtin declaration, so this would be a redeclaration. We should test that!)


Please also test the _not_reserved case here. Parameters are weird, because they're parsed before we have a DeclContext created to correspond with their scope, so they start off with the translation unit as their parent.


This should warn: this declaration would conflict with an int _barbeFleurie(); in a system header.

... although weirdly Clang doesn't diagnose that conflict. Hm. That looks like a bug.

mizvekov added a subscriber: mizvekov.EditedFeb 3 2021, 4:29 PM

There is a GNU extension where the linker provides symbols for section / start end, and using these with this patch would produce a warning.

[[gnu::section("foo")]] void bar() {}
extern "C" void __start_foo();
extern "C" void __stop_foo();
aaron.ballman added inline comments.Feb 4 2021, 5:22 AM

_preserved is not an identifier with file scope, so I think we shouldn't warn here.

Hmm, my thinking was that if the implementation added a conflicting definition of what _preserved is, it'd break the user's code. But you're right, the identifier is in the tag name space but not with file scope, so not warning is correct.


You mean, specifically for _strdup?

Yes, but it's a more general pattern than just _strdup. C code (esp older C code) will sometimes add an external declaration to avoid having to use a #include for just one symbol. Additionally, some popular platforms (like Windows) add a bunch of functions in the implementation namespace like _strdup (and many, many others).

Perhaps an option would be to always warn, and if we find that users run into this situation a lot in practice, we could split the diagnostic so that users can control whether to warn on forward declarations of functions.

Patch rebased on main.

Updated error message to report the reason why an identifier is reserved.

Fix some formatting
Support literal operator

aaron.ballman added inline comments.Mar 16 2021, 12:58 PM

Because this is not a scoped enum, should these enumerators get an RIR_ prefix added to them?


Any reason not to return the enumeration as the result? We could add NotReserved = 0 to the enumeration so that calls to D->isReserved() will do the right thing.

(Also, might as well name the LangOptions parameter here.)


Were you planning to use this new diagnostic group?


Might as well address the lint warning.


Adress final(?) comments from @rsmith and @aaron.ballman :
Don't warn on argument (or template argument) of top-level decl.
Extra test cases
Return an enumeration as part of the test

serge-sans-paille marked 10 inline comments as done.Mar 29 2021, 4:12 AM
serge-sans-paille added inline comments.

I used a class enum with a stream operator instead. Not a big fan of prefixed enum ;-)


yeah,that would require checking if the name with trailing underscores is a libc function. I agree we should wait for more user feedback to install such check.

aaron.ballman added inline comments.Mar 30 2021, 8:55 AM

Totally fine by me, thank you!


You may want to add a comment here to alert people to the fact that passing 0 for %1 is not valid but the | exists to make the enum order match the diagnostic. Otherwise it looks like it's possible to get a diagnostic identifier 'whatever' is reserved because .


Still missing a full stop at the end of the comment here.


Richard's comment is still unaddressed as well.


I'm still wondering about this as well.


This check should be reversed as well.

serge-sans-paille marked 10 inline comments as done.

Address all the needed changes pointed out by @aaron.ballman *except* the most critical one on the call to warnOnReservedIdentifier being spread at several place in Sema. I'll try to find a better approach for that particular point.


I'll investigate that one more deeply, I do share your concern.

rsmith added inline comments.Mar 30 2021, 4:22 PM

This is hopefully more useful to future readers and also terser. (I like including <ERROR> in the diagnostic in the bad case because if it ever happens we're more likely to get a bug report that way.)


Please reorder the literal operator case after the plain getIdentifier case. I think this'd be more readable if the common and expected case is first, and it doesn't make any functional difference since a literal operator doesn't have a "regular" identifier name.


Would it make sense to move the rest of this function to a member on IdentifierInfo? That is, the rest of this function would become:

ReservedIdentifierStatus Status = II->isReserved(LangOpts);
if (Status == ReservedIdentifierStatus::StartsWithUnderscoreAtGlobalScope &&
    isa<ParmVarDecl>(this) || isTemplateParameter() ||
  return ReservedIdentifierStatus::NotReserved;
return Status;

That way, we should be able to reuse the same implementation to detect reserved macros too.


This can be simplified a bit; I suggested edits above. Also, we want the semantic parent here, not the lexical parent, in order to warn on cases like

struct A {
  friend struct _b; // warning, starts with underscore at global scope
  friend void _f(); // warning, starts with underscore at global scope
void g() {
  void _h(); // warning, starts with underscore at global scope
  extern int _n; // warning, starts with underscore at global scope

Another round of review :-) I addressed both the scattering of warnOnReservedIdentifier and the code split between Decl.cpp and IdentifierInfo.cpp.


I did the II->isReserved(LangOpts) part, but the updated condition !getDeclContext()->getRedeclContext()->isTranslationUnit() fails the validation. I did already spend a lot of time getting one that catches the behavior we wanted, so 'm not super-eager to investigate why this change would not work ;-)

rsmith added inline comments.Mar 31 2021, 10:48 AM

The current lexical context check does not look correct -- the lexical context is not relevant here, because the language rule is constrained by the semantic context. What test cases does the new condition fail?

Please also add something like the testcases from my other comment; I can't find any tests for friends or for local extern declarations in the current set of tests, and that's the kind of corner case that would be affected by this.

Do not use lexical parent, as suggested by @rsmith
Add test case for extern function worward-declared in function body, as suggested by @rsmith

Warn on friend functions. I failed to support friend classes, but they are only declared and not defined, so that should be fine, right?

serge-sans-paille marked 3 inline comments as done.Apr 1 2021, 6:00 AM
aaron.ballman added inline comments.Apr 1 2021, 7:49 AM

Ooh, I like that suggestion, thanks!


It looks like this code still needs to move.


This comment should be updated -- we no longer walk the lexical parents.


I think comments here may help -- we don't know that the identifier is at global scope within this call, so the return value looks a bit strange. This is a case where the enumerator name is a bit awkward in this context but makes a lot of sense in the context of the check.


There's a slight functionality change here in that the code used to allow identifiers that start with a single underscore followed by a lowercase letter and that now early returns. Is that expected (and tested)?


Spurious whitespace change.


Spurious whitespace change.


Doesn't PushOnScopeChains() do this already?


This still needs to be reversed so we check for system headers after checking the reserved status.


Is this handled by PushOnScopeChains()?

Warn on friend functions. I failed to support friend classes, but they are only declared and not defined, so that should be fine, right?

They can still conflict, so I think we should warn. For example:

// user code
struct A { friend struct _b; };
// system header
enum _b {}; // error

This should be declared with the definition of ReservedIdentifierStatus -- or simply removed. We usually include the cast at the diagnostic emission site, which I find makes it locally clear that we're making an assumption about the numerical values of the enumerators.


Some of the uses of this warning are for non-external identifiers. It'd also be nice for the diagnostics in this area to have consistent names. Currently we have: -Wreserved-id-macro, -Wreserved-extern-identifier, -Wreserved-identifier. I'd like to see either consistent use of "identifier" or consistent use of "id' (preferably "identifier") and consistent word order.


Should we instead be asking whether CalleeDecl has a reserved name, now that we can?


Should we be using ND->isReserved here?


I'm surprised we don't warn in user headers. Is that intentional?

serge-sans-paille marked 10 inline comments as done.

Take into account latest reviews

aaron.ballman accepted this revision.May 3 2021, 5:47 AM

Aside from the formatting nit and the question about the diagnostic, I think this LGTM!


Do we want to keep this old name around as an alias to the new name (this makes upgrades easier on users who mention the diagnostic by name in their build script or within diagnostic pragmas)?


The formatting issue is worth fixing.

This revision is now accepted and ready to land.May 3 2021, 5:47 AM
This revision was landed with ongoing or failed builds.May 4 2021, 2:20 AM
This revision was automatically updated to reflect the committed changes.
Herald added a project: Restricted Project. · View Herald TranscriptMay 4 2021, 2:20 AM