Extend checker 'ErrnoModeling' with a state of 'errno' to indicate
the importance of the 'errno' value and how it should be used.
Add a new checker 'ErrnoChecker' that observes use of 'errno' and
finds possible wrong uses, based on the "errno state".
The "errno state" should be set (together with value of 'errno')
by other checkers (that perform modeling of the given function)
in the future. Currently only a test function can set this value.
The new checker has no user-observable effect yet.
Details
Diff Detail
- Repository
- rG LLVM Github Monorepo
Event Timeline
This checker is made to add a partial support for CERT rule ERR30-C . One part of the rule is "check errno only after the function returns a value indicating failure".
To make this check possible the function (one that sets errno in some way) should be modeled by another checker that knows when a failure-indication value is returned from the function. In (but only in) that case the function sets value of errno. Return value of the function call should be constrained by the modeling checker to the failure-indicating values if the errno value is set, otherwise to some other values (a state split is needed).
The new API allows to set the errno value only together with an "errno check state". This state indicates how to handle the errno value by the ErrnoChecker. This information is available at the modeling of the errno-setting function. The CERT rule specifies classes of functions, including "functions that set errno and return an out-of-band error indicator" and "set errno and return an in-band error indicator". At the out-of-band case the errno value is not required to be checked, failure can be observed by check of the return value. At the in-band case the return value at failure is a valid return value too, here errno must be checked to observe if the function has failed. This case is modeled by the Errno_MustBeChecked errno check state. At many functions value of errno may be undefined after the function call if the function has not failed (the function is not required to not change errno), this is modeled by the Errno_MustNotBeChecked value.
I'm liking it. We need to improve the diagnostics and the user-facing docs as well.
clang/include/clang/StaticAnalyzer/Checkers/Checkers.td | ||
---|---|---|
552 | Then we should have documentation, and examples for it. | |
clang/lib/StaticAnalyzer/Checkers/ErrnoChecker.cpp | ||
48–50 | One should not define member variables like this. | |
69–72 | Loc always wrap a memregion of some sort. | |
115–117 | I think an early return could have spared an indentation level in the outer scope. | |
199 | ||
204–207 | ||
clang/lib/StaticAnalyzer/Checkers/ErrnoModeling.cpp | ||
265–266 | ||
clang/lib/StaticAnalyzer/Checkers/ErrnoModeling.h | ||
24 | Did you consider enum classes? | |
42 | I would appreciate some notes about when it returns None. | |
clang/test/Analysis/errno.c | ||
73–75 | We should have a note describing which function left the errno in an undefined/indeterminate state. | |
170 | So this is the case when errno gets indirectly invalidated by some opaque function call. I see. |
clang/lib/StaticAnalyzer/Checkers/ErrnoChecker.cpp | ||
---|---|---|
48–50 | In this case it's probably fine given that the variable doesn't really change throughout a single ExplodedGraph construction. But I agree that it doesn't really make sense to cache this data; getErrnoLoc() an O(1) lookup anyway. | |
69–72 | Nope; null pointer is a Loc but doesn't correspond to any memory region. I usually print out doxygen inheritance graphs:
and hang them on the wall near my desk. They're really handy. On a separate note, why not make this assertion part of getErrnoLoc()? | |
94 | "May" is more neutral. | |
98 | This path-insensitive note is attached to the exact same source location as the warning, right? I think this should be part of the warning instead. Something like this: foo(); // (1) path note: Call to 'foo()' indicates failure only by setting 'errno' bar(); // (2) warning: Value of 'errno' potentially overwritten by 'bar()' was not checked | |
105 | Note that checkBeginFunction() is every time a nested stack frame is entered. This happens much more often than an update to the variable is needed. |
clang/lib/StaticAnalyzer/Checkers/ErrnoChecker.cpp | ||
---|---|---|
69–72 | ah true |
Address review comments.
- Removed caching of errno-related values from the checker.
- Added note tags.
- Added documentation comments.
clang/include/clang/StaticAnalyzer/Checkers/Checkers.td | ||
---|---|---|
552 | Yes this is missing. But currently the errno value is not set by any non-debug checker, if there is any real example in the documentation it will not work (until the function is modeled correctly). Adding errno support for the functions (or some of them) is a separate change, we can add documentation here and commit the changes together (as close as possible). | |
clang/lib/StaticAnalyzer/Checkers/ErrnoChecker.cpp | ||
69–72 | This assert is not needed, the errno value has always a place that is a MemRegion according to how the ErrnoModeling works. | |
105 | The "cached" errno values are removed, this function is not needed. | |
clang/lib/StaticAnalyzer/Checkers/ErrnoModeling.h | ||
24 | Originally it was enum class but I had to change it for some reason. Probably it did not have a default value. | |
clang/test/Analysis/errno.c | ||
170 | The check is now done indirectly by checking that no warning is produced. |
This errno check will work if a state split is added at every standard function that may set errno at failure. (The success and failure branches have different errno check state, and probably different return value for the function.) Probably this is too many state splits. But a correct program should have anyway checks for the return value that make the same state splits.
I think this is a good initiative and valuable work. You might be right about the too many state splits, however, we should measure this empirically. We should have a measurement on opensource projects to highlight peak memory usage and runtime discrepancies compared to the baseline. It might turn out that we indeed have to create a list of functions on which we are allowed to do the state split. Also, would be useful to evaluate the new reports.
D126801 should simplify the reinterpret casts.
clang/include/clang/StaticAnalyzer/Checkers/Checkers.td | ||
---|---|---|
558 | I think we have 'Hidden', but not 'Hide'. | |
clang/lib/StaticAnalyzer/Checkers/ErrnoModeling.cpp | ||
261–262 | It looks odd that in the very next if block you have II and here you don't. | |
clang/lib/StaticAnalyzer/Checkers/ErrnoModeling.h | ||
34 | You don't need to prefix these with Errno_. It's already contained by a specific namespace.
|
clang/include/clang/StaticAnalyzer/Checkers/Checkers.td | ||
---|---|---|
558 | This "Hide" applies for the option, not for the checker. It was copied from another place. I am not sure if it must be used here. |
I believe evalAssume would be a better fit for diagnosing such issues, but I can see your point that we don't have CheckerContext there to emit reports.
That being said, the check::Location is the best alternative.
Please also check the not done comments.
clang/docs/analyzer/checkers.rst | ||
---|---|---|
2529 | successful | |
2538–2539 | Maybe place a crossref for this. Also, am I right that there is a CERT rule about this issue? | |
clang/include/clang/StaticAnalyzer/Checkers/Checkers.td | ||
550 | antipatterns? | |
554 | My problem with this name is that Unconditional seems to bind to the action, not to the place where the read happens (inside condition expressions). How about calling this AllowReadingErrnoInsideConditionExpressions. Still bad, but probably better. | |
558 | Quote CheckerBase.td:39 /// Marks the entry hidden. Hidden entries won't be displayed in /// -analyzer-checker-option-help. class HiddenEnum<bit val> { bit Val = val; } def DontHide : HiddenEnum<0>; def Hide : HiddenEnum<1>; I believe, we should expose this option. Otherwise why would we introduce it in the first place? | |
clang/lib/StaticAnalyzer/Checkers/ErrnoChecker.cpp | ||
11–12 | and the StdLibraryFunctionsChecker per D125400. | |
102–103 | We should also handle BinaryOperator for conditional operators (EQ, NE, LT, LE, ...). Or at least leave a FIXME about this. | |
clang/lib/StaticAnalyzer/Checkers/ErrnoModeling.cpp | ||
255 | I believe you no longer need to cast enums to numbers and back for enums and other types. | |
274 |
clang/docs/analyzer/checkers.rst | ||
---|---|---|
2538–2539 | Probably there is not a preferred or "standard" site for man page lookup and it can be done from command line too. But I can include a POSIX link https://pubs.opengroup.org/onlinepubs/9699919799/. | |
clang/include/clang/StaticAnalyzer/Checkers/Checkers.td | ||
550 | I like "improper use of errno" better, it is similar to HelpText of other checkers. | |
clang/lib/StaticAnalyzer/Checkers/ErrnoChecker.cpp | ||
102–103 | The idea is here that this search does not stop at binary operators but goes up the AST until a condition is found. | |
clang/lib/StaticAnalyzer/Checkers/ErrnoModeling.cpp | ||
255 | How to handle if no value exists? | |
261–262 | I do not get what you mean here. | |
274 | I can not omit c_str, there is a compile error probably related to lambda return type. | |
clang/lib/StaticAnalyzer/Checkers/ErrnoModeling.h | ||
34 | Without Errno_ these sound too generic. Probably have only Errno prefix, not Errno_? |
It looks great.
Let me check the test coverage and the generated docs page it everything looks great.
clang/docs/analyzer/checkers.rst | ||
---|---|---|
2538 | ||
2540 | Please mention at least one of those functions. | |
clang/include/clang/StaticAnalyzer/Checkers/Checkers.td | ||
553–554 | Well, now I see why you marked this Hide previously. Interestingly, other checkers mark some options Hide and other times DontHide. | |
clang/lib/StaticAnalyzer/Checkers/ErrnoChecker.cpp | ||
47 | I don't really understand this comment. What is a non-condition part of a statement? | |
146 | Might be more readable to early return instead. | |
206 | Does this refer to the callsite or to the Decl location? | |
clang/lib/StaticAnalyzer/Checkers/ErrnoModeling.cpp | ||
261–262 | I was thinking about this: bool isErrno(const Decl *D) { if (const auto *VD = dyn_cast_or_null<VarDecl>(D)) if (const IdentifierInfo *II = VD->getIdentifier()) return II->getName() == ErrnoVarName; if (const auto *FD = dyn_cast_or_null<FunctionDecl>(D)) if (const IdentifierInfo *II = FD->getIdentifier()) return llvm::is_contained(ErrnoLocationFuncNames, II->getName()); return false; } This way both checks would follow the same pattern, symmetry. | |
274 | Yes, you will need to specify explicitly the return type of the lambda: -> std::string. | |
clang/lib/StaticAnalyzer/Checkers/ErrnoModeling.h | ||
34 | But you always refer to it by it's qualified name: errno_modeling::Errno_Irrelevant. |
clang/lib/StaticAnalyzer/Checkers/ErrnoModeling.cpp | ||
---|---|---|
73 | You will need this to get it compiled. |
Awesome!
The generated doc section looks great. The test coverage it excellent, but I would recommend adding tests for covering the following lines:
- ErrnoChecker.cpp:99
- ErrnoModeling.cpp:259
- ErrnoModeling.cpp:264
There are two other uncovered cases, but those are mainly defensive checks, so I don't mind them.
clang/docs/analyzer/checkers.rst | ||
---|---|---|
2565 | by | |
2586 | I think it should be surrounded by double backticks. It looks ugly this way: |
clang/include/clang/StaticAnalyzer/Checkers/Checkers.td | ||
---|---|---|
553–554 | It looks good now, ErrnoChecker is not a modeling checker and the option should be accessible for users. | |
clang/lib/StaticAnalyzer/Checkers/ErrnoChecker.cpp | ||
48 | Is it really missing? (I do not check Doxygen documentation, probably the new documentation appears in the online pages after the commit.) | |
101 | Yes: Even if it looks natural that getCond can be used the same way as at normal ConditionalOperator this is not true: It returns something other (casted or converted) than the root condition expressions. | |
146 | There is a else part. | |
206 | CallF is a FunctionDecl that should be the original (first) declaration of the function. |