Page MenuHomePhabricator

[clang-tidy] Add a check to detect returning static locals in public headers
Needs ReviewPublic

Authored by jranieri-grammatech on Nov 7 2018, 12:55 PM.

Details

Summary

This adds a checker, llvm-problematic-statics, that detects functions in header files that return the address of a local static variable. These cause problems if the resulting pointer is expected to be unique because it may have different values across shared library boundaries.

It's has been a problem in the static analyzer that I've had to deal with in the past (https://reviews.llvm.org/D52905 and https://reviews.llvm.org/D52906).

Diff Detail

Event Timeline

C++ Core Guidelines also mentioned similar patterns, so may be this check belongs to bugprone module?

docs/clang-tidy/checks/llvm-problematic-statics.rst
7 ↗(On Diff #173008)

static local variable? Probably Looks for should be replaced with Detects. Please also wrap with 80 characters.

Eugene.Zelenko added a project: Restricted Project.
Eugene.Zelenko retitled this revision from [clang-tidy/checks] Add a checker to detect returning static locals in public headers to [clang-tidy] Add a check to detect returning static locals in public headers.

I agree with @Eugene.Zelenko that this check should rather live in bugprone- as the issue its diagnosing is not LLVM specific. It is ok to add an alias into the llvm module.

clang-tidy/llvm/ProblematicStaticsCheck.cpp
22 ↗(On Diff #173008)

Nit: you can inline this variable, having it does not improve readability (IMHO) nor does it reduce duplication.

26 ↗(On Diff #173008)

I think the matching on include should not be done. Maybe it's ok to filter for header files, but we should not assume they live in include/... (as i think this check is generally useful and not too LLVM specific).

More filters for false positives could be checking if the function itself is static or in an anonymous namespace.

33 ↗(On Diff #173008)

Please assert on VD and Return

test/clang-tidy/llvm-problematic-statics.cpp
3 ↗(On Diff #173008)

I think the testing should include more cases. At least normal functions, and a false case as well.

aaron.ballman added inline comments.Nov 7 2018, 3:09 PM
clang-tidy/llvm/ProblematicStaticsCheck.cpp
33 ↗(On Diff #173008)

Please assert on VD and Return

Why?

Please mention new check in Release Notes (in alphabetical order).

JonasToth added inline comments.Nov 8 2018, 2:35 PM
clang-tidy/llvm/ProblematicStaticsCheck.cpp
33 ↗(On Diff #173008)

because the matcher might evolve over time. Right now this can not be triggered, that is correct. Asking for it was inappropriate but having it doesn't hurt.

aaron.ballman added inline comments.Nov 8 2018, 2:39 PM
clang-tidy/llvm/ProblematicStaticsCheck.cpp
33 ↗(On Diff #173008)

All checks may evolve over time -- this isn't a pattern I'd like to see used in this sort of case, personally.

JonasToth added inline comments.Nov 8 2018, 3:20 PM
clang-tidy/llvm/ProblematicStaticsCheck.cpp
33 ↗(On Diff #173008)

I don't insist.

aaron.ballman added inline comments.Nov 8 2018, 3:44 PM
docs/clang-tidy/checks/llvm-problematic-statics.rst
7 ↗(On Diff #173008)

to have the address -> to have the same address

NoQ added a comment.Nov 9 2018, 5:27 PM

Thanks! I don't have much to add to the review, but i very much approve this check.

ping.
This checks seems pretty much done. @jranieri-grammatech do you think there is something missing? Otherwise it could be rebased and comitted?

@JonasToth It looks like there are some outstanding review comments that need to be addressed. I've gotten some time allocated to work on this next week.

@JonasToth It looks like there are some outstanding review comments that need to be addressed. I've gotten some time allocated to work on this next week.

Great!

I agree with @Eugene.Zelenko that this check should rather live in bugprone- as the issue its diagnosing is not LLVM specific. It is ok to add an alias into the llvm module.

Here are the warnings it issues with the patch's matcher:

clang/include/clang/StaticAnalyzer/Core/CheckerManager.h:625:43: warning: address of static local variable 'tag' may not be identical across library boundaries [llvm-problematic-statics]
clang/include/clang/StaticAnalyzer/Core/PathSensitive/RangedConstraintManager.h:219:1: warning: address of static local variable 'Index' may not be identical across library boundaries [llvm-problematic-statics]
clang/include/clang/StaticAnalyzer/Core/PathSensitive/SMTConstraintManager.h:25:1: warning: address of static local variable 'Index' may not be identical across library boundaries [llvm-problematic-statics]

And here are the additional warnings it issues when relaxing the matcher (to be unless(isExpansionInMainFile())):

clang/lib/StaticAnalyzer/Checkers/Iterator.h:130:47: warning: address of static local variable 'Index' may not be identical across library boundaries [llvm-problematic-statics]
clang/lib/StaticAnalyzer/Checkers/Iterator.h:136:47: warning: address of static local variable 'Index' may not be identical across library boundaries [llvm-problematic-statics]
clang/lib/StaticAnalyzer/Checkers/Iterator.h:142:47: warning: address of static local variable 'Index' may not be identical across library boundaries [llvm-problematic-statics]
clang/lib/StaticAnalyzer/Checkers/MPI-Checker/MPITypes.h:60:5: warning: address of static local variable 'index' may not be identical across library boundaries [llvm-problematic-statics]
llvm/utils/unittest/googletest/src/gtest.cc:3989:3: warning: address of static local variable 'instance' may not be identical across library boundaries [llvm-problematic-statics]

And also this one, which was hidden by default but made running tidy noisier by always saying there was a hidden warning:

/usr/bin/../include/c++/v1/functional:1960:9: warning: address of static local variable '__policy_' may not be identical across library boundaries [llvm-problematic-statics]

All of the first set of warnings might be actual problems, but the second set definitely aren't problems because they aren't going to be used across shared library boundaries. By making this be a LLVM-specific check, we can avoid the noise of these false positives. Also, I looked through the C++ Core Guidelines for any rules that might address this but didn't see anything relevant.

Thoughts?

I agree with @Eugene.Zelenko that this check should rather live in bugprone- as the issue its diagnosing is not LLVM specific. It is ok to add an alias into the llvm module.

Here are the warnings it issues with the patch's matcher:

clang/include/clang/StaticAnalyzer/Core/CheckerManager.h:625:43: warning: address of static local variable 'tag' may not be identical across library boundaries [llvm-problematic-statics]
clang/include/clang/StaticAnalyzer/Core/PathSensitive/RangedConstraintManager.h:219:1: warning: address of static local variable 'Index' may not be identical across library boundaries [llvm-problematic-statics]
clang/include/clang/StaticAnalyzer/Core/PathSensitive/SMTConstraintManager.h:25:1: warning: address of static local variable 'Index' may not be identical across library boundaries [llvm-problematic-statics]

And here are the additional warnings it issues when relaxing the matcher (to be unless(isExpansionInMainFile())):

clang/lib/StaticAnalyzer/Checkers/Iterator.h:130:47: warning: address of static local variable 'Index' may not be identical across library boundaries [llvm-problematic-statics]
clang/lib/StaticAnalyzer/Checkers/Iterator.h:136:47: warning: address of static local variable 'Index' may not be identical across library boundaries [llvm-problematic-statics]
clang/lib/StaticAnalyzer/Checkers/Iterator.h:142:47: warning: address of static local variable 'Index' may not be identical across library boundaries [llvm-problematic-statics]
clang/lib/StaticAnalyzer/Checkers/MPI-Checker/MPITypes.h:60:5: warning: address of static local variable 'index' may not be identical across library boundaries [llvm-problematic-statics]
llvm/utils/unittest/googletest/src/gtest.cc:3989:3: warning: address of static local variable 'instance' may not be identical across library boundaries [llvm-problematic-statics]

And also this one, which was hidden by default but made running tidy noisier by always saying there was a hidden warning:

/usr/bin/../include/c++/v1/functional:1960:9: warning: address of static local variable '__policy_' may not be identical across library boundaries [llvm-problematic-statics]

All of the first set of warnings might be actual problems, but the second set definitely aren't problems because they aren't going to be used across shared library boundaries. By making this be a LLVM-specific check, we can avoid the noise of these false positives. Also, I looked through the C++ Core Guidelines for any rules that might address this but didn't see anything relevant.

Thoughts?

Do you have data for other projects? As this is not a very common thing and probably different for code-bases with plugins and so on, the "chattiness" of the check would be interesting to know.
If the check is usually quiet, then i would think its ok to have it as a general check together with proper documentation explaining why these statics can be a problem.

I would still like to have it in bugprone-, because this is a real problem that can arise and the check is more likely to be ignored if considered to be "just for llvm".
The decision for true vs false positive is not possible to decide within clang-tidy, is it? I think this should then be left to the developer (as it is probably not that much?).

Do you have data for other projects? As this is not a very common thing and probably different for code-bases with plugins and so on, the "chattiness" of the check would be interesting to know.

I ran it on our large internal codebase. It generated one unique warning (2 total) with the filter on /include/. It was a false positive but at least it was actually complaining about something plugin-related. When loosening it up to any included file, that grows to five unique warnings (231 total) and all of the additional warnings were false positives.

If the check is usually quiet, then i would think its ok to have it as a general check together with proper documentation explaining why these statics can be a problem.

I think the fact that it's triggered in header files makes it inherently noisier than other checkers. The warnings in LLVM I pasted above were only the unique instances -- there were a total of 323 warnings issued for the first set and a total of 353 warnings issued when relaxing the restrictions.

I would still like to have it in bugprone-, because this is a real problem that can arise and the check is more likely to be ignored if considered to be "just for llvm".

I've updated the patch and moved it into bugprone-. I think it probably needs a better name than bugprone-problematic-statics, now that it's expected to be used outside of the LLVM context.

The decision for true vs false positive is not possible to decide within clang-tidy, is it? I think this should then be left to the developer (as it is probably not that much?).

It can't be decided completely. I think that this checker would be more useful with a configuration option to specify a regex for "plugin API headers", but I don't have the time to implement it.

Herald added a project: Restricted Project. · View Herald TranscriptThu, Oct 1, 7:26 AM
rsmith added a subscriber: rsmith.Thu, Oct 1, 10:36 PM
rsmith added inline comments.
clang-tools-extra/docs/clang-tidy/checks/bugprone-problematic-statics.rst
7–8

The C++ language rules do guarantee that such static locals have the same address throughout the entire program. Should this warning check the symbol visibility / dllexportedness? People are going to learn the wrong thing if we give them this diagnostic without any further explanation.

If the concern is plugins loaded with RTLD_LOCAL, sharing C++ interfaces across an RTLD_LOCAL boundary is pretty fundamentally broken, and this is far from the only way in which it goes wrong. It might be better to warn on all non-RTLD_GLOBAL calls to dlopen from C++ code...

I am confused now, but richard knows this stuff very well! The patches you referenced showed only cases where the static was defined in an inline-header definition (and templates). Does this make a difference on the semantics?
This should be clear before we continue.

I am confused now, but richard knows this stuff very well! The patches you referenced showed only cases where the static was defined in an inline-header definition (and templates). Does this make a difference on the semantics?
This should be clear before we continue.

Local static variables in inline functions and function templates are required by the C++ language semantics to result in exactly one variable in the whole program. In easy / typical cases, that works. However, this fails in at least the following cases:

  • Under the MS ABI, DLLs behave more like separate programs sharing an address space than like a single program, at least by default, so by default each DLL will see a different static local variable (and similarly for class statics and other vague linkage entities). Marking the class or function as __declspec(dllexport) / __declspec(dllimport) as appropriate would be sufficient to make the static variable behave properly if it appears on the DLL interface; if you don't mark it suitably, you broke the DLL import/export rules.
  • On POSIX systems, dlopen with RTLD_LOCAL (which is, unfortunately, the default for many systems), or linking a shared library with -Bsymbolic, can result in symbols being duplicated across the program. That's sort of fundamental -- that's what those options are supposed to do -- but they break C++ language semantics in various ways, this being one of them. (The idea that dlopen with RTLD_LOCAL creates a local copy of the specified shared library doesn't work, because -- due to C++'s heavy reliance on vague linkage definitions in interfaces -- the shared library will in general also contain symbols that are not part of that library, such as instantiations of someone else's templates or copies of someone else's local static variables, inline functions, type info, etc., and so you can also end up with local copies of those other things that aren't part of the shared library and that are expected to be unique across the program.) The solution is, don't do that -- it's not safe to use those options for shared libraries that either expose or use a vague linkage interface (which is more or less any modern C++ interface) from a different library and for which symbol uniqueness across the boundary matters (which is very hard to falsify in general).

I'm not sure whether we can locally distinguish the first case from harmless cases where the exposed state is not on a DLL boundary. We can detect instances of the second case by looking for dlopen calls in C++ code that use RTLD_LOCAL and warning on them (but we presumably can't detect -Bsymbolic from clang-tidy, because it's a linker flag).