This is an archive of the discontinued LLVM Phabricator instance.

[clang-tidy] cppcoreguidelines-interfaces-global-init
ClosedPublic

Authored by courbet on Mar 31 2016, 8:30 AM.

Details

Summary

This check flags initializers of globals that access extern objects, and therefore can lead to order-of-initialization problems (this recommandation is part of CPP core guidelines).
Note that this only checks half of the guideline for now (it does not enforce using constexpr functions).

Diff Detail

Repository
rL LLVM

Event Timeline

courbet updated this revision to Diff 52213.Mar 31 2016, 8:30 AM
courbet retitled this revision from to [clang-tidy] cppcoreguidelines-interfaces-global-init.
courbet updated this object.
courbet added a reviewer: alexfh.
courbet added a subscriber: cfe-commits.

Please mention new check in docs/ReleaseNotes.rst.

Please mention new check in docs/ReleaseNotes.rst.

That should be in a different commit, right ? Release notes are in a different repo (forgive my ignorance, this is my first patch).

alexfh edited edge metadata.Apr 1 2016, 5:45 AM

Please mention new check in docs/ReleaseNotes.rst.

That should be in a different commit, right ? Release notes are in a different repo (forgive my ignorance, this is my first patch).

Now they are in the same repo (clang-tools-extra, docs/ReleaseNotes.rst).

courbet updated this revision to Diff 52353.Apr 1 2016, 5:55 AM
courbet edited edge metadata.

Updated release notes.

alexfh added a comment.Apr 1 2016, 6:38 AM

Thank you for working on the new clang-tidy check!

We usually recommend authors to run their checks on a large code base to ensure it doesn't crash and doesn't generate obvious false positives. It would be nice, if you could provide a quick summary of such a run (total number of hits, number of what seems to be a false positive in a sample of ~100).

clang-tidy/cppcoreguidelines/InterfacesGlobalInitCheck.cpp
41 ↗(On Diff #52213)

nit: No need for braces around single-line if/for/... bodies.

clang-tidy/cppcoreguidelines/InterfacesGlobalInitCheck.h
20 ↗(On Diff #52213)

I'd leave the "For user-facing documentation, see ..." phrase that the add_new_check.py script adds.

test/clang-tidy/cppcoreguidelines-interfaces-global-init.cpp
37 ↗(On Diff #52213)

These should be on a single line. I guess, your clang-format doesn't use -style=file by default and doesn't understand the local style configured for the test/ directory. Otherwise it wouldn't split the lines.

some nits.

clang-tidy/cppcoreguidelines/InterfacesGlobalInitCheck.cpp
21 ↗(On Diff #52353)

nit: const (and below)

test/clang-tidy/cppcoreguidelines-interfaces-global-init.cpp
42 ↗(On Diff #52353)

nit: initilization -> initialization

courbet updated this revision to Diff 52366.Apr 1 2016, 6:53 AM
courbet marked 2 inline comments as done.
courbet removed a subscriber: etienneb.

Style fixes.

Looks like there's a race condition in phabricator.

test/clang-tidy/cppcoreguidelines-interfaces-global-init.cpp
37 ↗(On Diff #52213)

Thanks for the hint. I was using -style=LLVM indeed.

more nits, sorry didn't saw them first time.

test/clang-tidy/cppcoreguidelines-interfaces-global-init.cpp
32 ↗(On Diff #52366)

nit: namspace -> namespace

43 ↗(On Diff #52366)

nit: remove empty line here.

courbet updated this revision to Diff 52370.Apr 1 2016, 7:26 AM
courbet marked 3 inline comments as done.

Constness + typos.

alexfh added inline comments.Apr 1 2016, 8:22 AM
clang-tidy/cppcoreguidelines/InterfacesGlobalInitCheck.cpp
46 ↗(On Diff #52370)

nit: no trailing period needed in diagnostic messages

47 ↗(On Diff #52370)

This might work even better without ->getName(). Not sure, but it makes sense to try.

docs/clang-tidy/checks/cppcoreguidelines-interfaces-global-init.rst
7 ↗(On Diff #52370)

What about indirect access of other globals, e.g. via function calls? We obviously can't analyze functions defined in a different translation unit, and I'm not sure we need to analyze even those defined in the same TU (since it may be imprecise and quite expensive), but maybe we should use some heuristic here (the most agressive would be to flag all initializers containing function calls, but that might be rather noisy).

alexfh requested changes to this revision.Apr 1 2016, 8:23 AM
alexfh edited edge metadata.
This revision now requires changes to proceed.Apr 1 2016, 8:23 AM
courbet added a comment.EditedApr 4 2016, 12:19 AM

Thank you for working on the new clang-tidy check!

We usually recommend authors to run their checks on a large code base to ensure it doesn't crash and doesn't generate obvious false positives. It would be nice, if you could provide a quick summary of such a run (total number of hits, number of what seems to be a false positive in a sample of ~100).

The tool generated 20k positives on our codebase. On a sample of 100, there are:

  • 8 instances of the same exact code structure that's just wrong: const string var = FLAGS_some_flag + "some_sufix";
  • 8 false positives.
  • 84 possible issues. (interestingly 6 of these are from premature use of variations of "extern char* empty_string;"

The false positives fall into 3 categories:

  1. 3 variations of:
extern int i;
static const int* pi = &i;  // diag

// Then pi is dereferenced later, once i is intialized.
Public example of this: https://github.com/python-git/python/blob/py3k/Objects/dictobject.c#L2027

  1. 3 variations of:
// .h
class A {
  static const int i = 42;
};
// .cc
int A::i;  // diag
  1. 2 variations of:
// .h
class A {
static int i;
static int j;
};
// .cc
int A::i = 0;
int A::j = i;  // diag
courbet updated this revision to Diff 52541.Apr 4 2016, 4:04 AM
courbet edited edge metadata.
courbet marked 2 inline comments as done.

cosmetics + more documentation

docs/clang-tidy/checks/cppcoreguidelines-interfaces-global-init.rst
7 ↗(On Diff #52370)

The guidelines actually recommend flagging calls non-constexpr functions, but this would give too many positives. I've added a note.

alexfh added a comment.Apr 4 2016, 4:40 AM

Thank you for working on the new clang-tidy check!

We usually recommend authors to run their checks on a large code base to ensure it doesn't crash and doesn't generate obvious false positives. It would be nice, if you could provide a quick summary of such a run (total number of hits, number of what seems to be a false positive in a sample of ~100).

The tool generated 20k positives on our codebase. On a sample of 100, there are:

  • 8 instances of the same exact code structure that's just wrong: const string var = FLAGS_some_flag + "some_sufix";
  • 8 false positives.
  • 84 possible issues. (interestingly 6 of these are from premature use of variations of "extern char* empty_string;"

The false positives fall into 3 categories:

  1. 3 variations of:
extern int i;
static const int* pi = &i;  // diag

Should we warn at all when only an address of a global variable is used?

// Then pi is dereferenced later, once i is intialized.
Public example of this: https://github.com/python-git/python/blob/py3k/Objects/dictobject.c#L2027

  1. 3 variations of:
// .h
class A {
  static const int i = 42;
};
// .cc
int A::i;  // diag

Looks like we have all information to fix this kind of a false positive.

  1. 2 variations of:
// .h
class A {
static int i;
static int j;
};
// .cc
int A::i = 0;
int A::j = i;  // diag

ditto

alexfh requested changes to this revision.Apr 4 2016, 4:41 AM
alexfh edited edge metadata.
This revision now requires changes to proceed.Apr 4 2016, 4:41 AM

Thank you for working on the new clang-tidy check!

We usually recommend authors to run their checks on a large code base to ensure it doesn't crash and doesn't generate obvious false positives. It would be nice, if you could provide a quick summary of such a run (total number of hits, number of what seems to be a false positive in a sample of ~100).

The tool generated 20k positives on our codebase. On a sample of 100, there are:

  • 8 instances of the same exact code structure that's just wrong: const string var = FLAGS_some_flag + "some_sufix";
  • 8 false positives.
  • 84 possible issues. (interestingly 6 of these are from premature use of variations of "extern char* empty_string;"

The false positives fall into 3 categories:

  1. 3 variations of:
extern int i;
static const int* pi = &i;  // diag

Should we warn at all when only an address of a global variable is used?

We could, but it would allow stuff like:

extern const int i;
const char c = *(const char*)&i;

No strong opinion here.

// Then pi is dereferenced later, once i is intialized.
Public example of this: https://github.com/python-git/python/blob/py3k/Objects/dictobject.c#L2027

  1. 3 variations of:
// .h
class A {
  static const int i = 42;
};
// .cc
int A::i;  // diag

Looks like we have all information to fix this kind of a false positive.

  1. 2 variations of:
// .h
class A {
static int i;
static int j;
};
// .cc
int A::i = 0;
int A::j = i;  // diag

ditto

It turns out (2) and (3) are just variations of the same - (2) was just:

// .h
class A {
  static const int i = 0;
  static const int j = i;
};
// .cc
int A::i;
int A::j;

Which has *exactly* the same semantics as (3).

I could not find a way to go from a decl to its (possible) definition. I was thinking of matching all global definitions that are not declarations and then making sure that the current matches are not referring to one of those definitions (in syntactic order because of [3.6.2.3]). Does this sound like a reasonable idea ?

alexfh added a comment.Apr 4 2016, 7:47 AM

Thank you for working on the new clang-tidy check!

We usually recommend authors to run their checks on a large code base to ensure it doesn't crash and doesn't generate obvious false positives. It would be nice, if you could provide a quick summary of such a run (total number of hits, number of what seems to be a false positive in a sample of ~100).

The tool generated 20k positives on our codebase. On a sample of 100, there are:

  • 8 instances of the same exact code structure that's just wrong: const string var = FLAGS_some_flag + "some_sufix";
  • 8 false positives.
  • 84 possible issues. (interestingly 6 of these are from premature use of variations of "extern char* empty_string;"

The false positives fall into 3 categories:

  1. 3 variations of:
extern int i;
static const int* pi = &i;  // diag

Should we warn at all when only an address of a global variable is used?

We could, but it would allow stuff like:

extern const int i;
const char c = *(const char*)&i;

No strong opinion here.

// Then pi is dereferenced later, once i is intialized.
Public example of this: https://github.com/python-git/python/blob/py3k/Objects/dictobject.c#L2027

  1. 3 variations of:
// .h
class A {
  static const int i = 42;
};
// .cc
int A::i;  // diag

Looks like we have all information to fix this kind of a false positive.

  1. 2 variations of:
// .h
class A {
static int i;
static int j;
};
// .cc
int A::i = 0;
int A::j = i;  // diag

ditto

It turns out (2) and (3) are just variations of the same - (2) was just:

// .h
class A {
  static const int i = 0;
  static const int j = i;
};
// .cc
int A::i;
int A::j;

Which has *exactly* the same semantics as (3).

I could not find a way to go from a decl to its (possible) definition.

VarDecl is also Redeclarable<VarDecl>, which has various ways to iterate over all related declarations.

I was thinking of matching all global definitions that are not declarations and then making sure that the current matches are not referring to one of those definitions (in syntactic order because of [3.6.2.3]). Does this sound like a reasonable idea ?

courbet updated this revision to Diff 52684.Apr 5 2016, 7:13 AM
courbet edited edge metadata.

Fix 5 of the 8 false positives. Add tests.

courbet updated this revision to Diff 52688.Apr 5 2016, 7:32 AM
courbet edited edge metadata.

Update naming to be constistent with the standard: use "non-local" instead of "global" (or "static" in my case, as I started from classes and the name stuck).

aaron.ballman added a subscriber: aaron.ballman.
aaron.ballman added inline comments.
clang-tidy/cppcoreguidelines/InterfacesGlobalInitCheck.cpp
27 ↗(On Diff #52688)

This can be moved into ReferencesUndefinedGlobalVar

courbet updated this revision to Diff 52696.Apr 5 2016, 8:48 AM
courbet marked an inline comment as done.

cosmetics

alexfh added inline comments.Apr 7 2016, 11:20 AM
test/clang-tidy/cppcoreguidelines-interfaces-global-init.cpp
7 ↗(On Diff #52696)

What happens if you add:

extern int ExternGlobal;
extern int ExternGlobal;
int ExternGlobal = 123;

?

alexfh requested changes to this revision.Apr 7 2016, 11:24 AM
alexfh edited edge metadata.
This revision now requires changes to proceed.Apr 7 2016, 11:24 AM
courbet updated this revision to Diff 52999.Apr 8 2016, 1:31 AM
courbet edited edge metadata.

Add unit test for multiple declaration then definition.

courbet added inline comments.Apr 8 2016, 1:32 AM
test/clang-tidy/cppcoreguidelines-interfaces-global-init.cpp
7 ↗(On Diff #52696)

It does not trigger (as expected). I've added a test.

alexfh accepted this revision.Apr 8 2016, 1:53 AM
alexfh edited edge metadata.

Looks good! Thank you for the new check!

Do you need me to submit the patch for you?

This revision is now accepted and ready to land.Apr 8 2016, 1:53 AM
alexfh added a comment.Apr 8 2016, 1:54 AM

(if you do, please rebase the patch on top of HEAD)

courbet updated this revision to Diff 53001.Apr 8 2016, 2:22 AM
courbet edited edge metadata.

Rebase on HEAD for submission.

That'd be great. Thanks all for the review.

This revision was automatically updated to reflect the committed changes.