This is an archive of the discontinued LLVM Phabricator instance.

[clang][analyzer] Add checker for bad use of 'errno'.
ClosedPublic

Authored by balazske on Mar 21 2022, 8:30 AM.

Details

Summary

Extend checker 'ErrnoModeling' with a state of 'errno' to indicate
the importance of the 'errno' value and how it should be used.
Add a new checker 'ErrnoChecker' that observes use of 'errno' and
finds possible wrong uses, based on the "errno state".
The "errno state" should be set (together with value of 'errno')
by other checkers (that perform modeling of the given function)
in the future. Currently only a test function can set this value.
The new checker has no user-observable effect yet.

Diff Detail

Event Timeline

balazske created this revision.Mar 21 2022, 8:30 AM
Herald added a project: Restricted Project. · View Herald Transcript
balazske requested review of this revision.Mar 21 2022, 8:30 AM
Herald added a project: Restricted Project. · View Herald TranscriptMar 21 2022, 8:30 AM
Herald added a subscriber: cfe-commits. · View Herald Transcript

This checker is made to add a partial support for CERT rule ERR30-C . One part of the rule is "check errno only after the function returns a value indicating failure".

To make this check possible the function (one that sets errno in some way) should be modeled by another checker that knows when a failure-indication value is returned from the function. In (but only in) that case the function sets value of errno. Return value of the function call should be constrained by the modeling checker to the failure-indicating values if the errno value is set, otherwise to some other values (a state split is needed).

The new API allows to set the errno value only together with an "errno check state". This state indicates how to handle the errno value by the ErrnoChecker. This information is available at the modeling of the errno-setting function. The CERT rule specifies classes of functions, including "functions that set errno and return an out-of-band error indicator" and "set errno and return an in-band error indicator". At the out-of-band case the errno value is not required to be checked, failure can be observed by check of the return value. At the in-band case the return value at failure is a valid return value too, here errno must be checked to observe if the function has failed. This case is modeled by the Errno_MustBeChecked errno check state. At many functions value of errno may be undefined after the function call if the function has not failed (the function is not required to not change errno), this is modeled by the Errno_MustNotBeChecked value.

steakhal requested changes to this revision.Mar 22 2022, 5:56 AM

I'm liking it. We need to improve the diagnostics and the user-facing docs as well.

clang/include/clang/StaticAnalyzer/Checkers/Checkers.td
552

Then we should have documentation, and examples for it.

clang/lib/StaticAnalyzer/Checkers/ErrnoChecker.cpp
48–50

One should not define member variables like this.
The analysis engine might explore exploded nodes in an unpredictable order, invoking the checker's handler callbacks in an indeterminate order.
You should have registered simple value traits for this purpose, associating the necessary information with a given State.

69–72

Loc always wrap a memregion of some sort.

115–117

I think an early return could have spared an indentation level in the outer scope.

199
204–207
clang/lib/StaticAnalyzer/Checkers/ErrnoModeling.cpp
265–266
clang/lib/StaticAnalyzer/Checkers/ErrnoModeling.h
24

Did you consider enum classes?

42

I would appreciate some notes about when it returns None.

clang/test/Analysis/errno.c
73–75

We should have a note describing which function left the errno in an undefined/indeterminate state.
You can chain multiple NoteTags by using the ExplodedNode returned by the addTransition() and passing it along if needed.

170

So this is the case when errno gets indirectly invalidated by some opaque function call. I see.
However, it will test that the value associated with the errno region gets invalidated, that's fine. However, you should test if the metadata (the enum value) attached to memregion also gets invalidated.
Please make sure you have tests for that as well.
A ErrnoTesterChecker_dumpCheckState() should be perfect for this.

This revision now requires changes to proceed.Mar 22 2022, 5:56 AM
NoQ added inline comments.Mar 22 2022, 10:48 AM
clang/lib/StaticAnalyzer/Checkers/ErrnoChecker.cpp
48–50

In this case it's probably fine given that the variable doesn't really change throughout a single ExplodedGraph construction. But I agree that it doesn't really make sense to cache this data; getErrnoLoc() an O(1) lookup anyway.

69–72

Nope; null pointer is a Loc but doesn't correspond to any memory region.

I usually print out doxygen inheritance graphs:

and hang them on the wall near my desk. They're really handy.

On a separate note, why not make this assertion part of getErrnoLoc()?

94

"May" is more neutral.

98

This path-insensitive note is attached to the exact same source location as the warning, right? I think this should be part of the warning instead. Something like this:

foo(); // (1) path note: Call to 'foo()' indicates failure only by setting 'errno'
bar(); // (2) warning: Value of 'errno' potentially overwritten by 'bar()' was not checked
105

Note that checkBeginFunction() is every time a nested stack frame is entered. This happens much more often than an update to the variable is needed.

steakhal added inline comments.Mar 22 2022, 11:31 AM
clang/lib/StaticAnalyzer/Checkers/ErrnoChecker.cpp
69–72

ah true

balazske updated this revision to Diff 418238.Mar 25 2022, 8:34 AM
balazske marked 11 inline comments as done.Mar 25 2022, 8:34 AM

Address review comments.

  • Removed caching of errno-related values from the checker.
  • Added note tags.
  • Added documentation comments.
balazske marked 2 inline comments as done.Mar 25 2022, 8:37 AM
balazske added inline comments.
clang/include/clang/StaticAnalyzer/Checkers/Checkers.td
552

Yes this is missing. But currently the errno value is not set by any non-debug checker, if there is any real example in the documentation it will not work (until the function is modeled correctly). Adding errno support for the functions (or some of them) is a separate change, we can add documentation here and commit the changes together (as close as possible).

clang/lib/StaticAnalyzer/Checkers/ErrnoChecker.cpp
69–72

This assert is not needed, the errno value has always a place that is a MemRegion according to how the ErrnoModeling works.

105

The "cached" errno values are removed, this function is not needed.

clang/lib/StaticAnalyzer/Checkers/ErrnoModeling.h
24

Originally it was enum class but I had to change it for some reason. Probably it did not have a default value.

clang/test/Analysis/errno.c
170

The check is now done indirectly by checking that no warning is produced.

This errno check will work if a state split is added at every standard function that may set errno at failure. (The success and failure branches have different errno check state, and probably different return value for the function.) Probably this is too many state splits. But a correct program should have anyway checks for the return value that make the same state splits.

This errno check will work if a state split is added at every standard function that may set errno at failure. (The success and failure branches have different errno check state, and probably different return value for the function.) Probably this is too many state splits. But a correct program should have anyway checks for the return value that make the same state splits.

I think this is a good initiative and valuable work. You might be right about the too many state splits, however, we should measure this empirically. We should have a measurement on opensource projects to highlight peak memory usage and runtime discrepancies compared to the baseline. It might turn out that we indeed have to create a list of functions on which we are allowed to do the state split. Also, would be useful to evaluate the new reports.

balazske updated this revision to Diff 432929.May 30 2022, 8:16 AM

added option to allow read-only of errno
other small change in the test checker

balazske updated this revision to Diff 433046.May 31 2022, 3:53 AM

rebase, fix of failing tests

martong added inline comments.Jun 1 2022, 8:24 AM
clang/include/clang/StaticAnalyzer/Checkers/Checkers.td
554

I think this name would be much more easier to read.

clang/lib/StaticAnalyzer/Checkers/ErrnoChecker.cpp
47

Or in a statement without a condition ?

68

Please add a documentation to this.

D126801 should simplify the reinterpret casts.

clang/include/clang/StaticAnalyzer/Checkers/Checkers.td
558

I think we have 'Hidden', but not 'Hide'.
It means that it is a modeling checker. Any checker that emits diagnostics, should not be marked 'Hidden'.

clang/lib/StaticAnalyzer/Checkers/ErrnoModeling.cpp
261–262

It looks odd that in the very next if block you have II and here you don't.

clang/lib/StaticAnalyzer/Checkers/ErrnoModeling.h
34

You don't need to prefix these with Errno_. It's already contained by a specific namespace.
https://llvm.org/docs/CodingStandards.html#name-types-functions-variables-and-enumerators-properly

Unless the enumerators are defined in their own small namespace or inside a class, enumerators should have a prefix corresponding to the enum declaration name

balazske updated this revision to Diff 435081.Jun 8 2022, 2:33 AM
balazske marked 5 inline comments as done.

applied review comments, test improvement

balazske added inline comments.Jun 8 2022, 2:36 AM
clang/include/clang/StaticAnalyzer/Checkers/Checkers.td
558

This "Hide" applies for the option, not for the checker. It was copied from another place. I am not sure if it must be used here.

martong added inline comments.Jun 9 2022, 3:24 AM
clang/docs/analyzer/checkers.rst
2553–2555

This sentence is a bit hard to follow, I'd split this up.

2559–2560

"all times" and "the first read" seems to be in contradiction to each other. Could you please rephrase? Use a standalone sentence that explains the "(at the first read)" part.

2560

?

balazske updated this revision to Diff 435557.Jun 9 2022, 8:15 AM
balazske marked 3 inline comments as done.

fixed documentation, added release note

martong accepted this revision.Jun 9 2022, 8:41 AM

LGTM, it is a good start for an alpha checker.

clang/docs/ReleaseNotes.rst
585

I believe evalAssume would be a better fit for diagnosing such issues, but I can see your point that we don't have CheckerContext there to emit reports.
That being said, the check::Location is the best alternative.

Please also check the not done comments.

clang/docs/analyzer/checkers.rst
2529

successful

2538–2539

Maybe place a crossref for this.

Also, am I right that there is a CERT rule about this issue?
We should probably mention that, while placing a crossref too.

clang/include/clang/StaticAnalyzer/Checkers/Checkers.td
550

antipatterns?

554

My problem with this name is that Unconditional seems to bind to the action, not to the place where the read happens (inside condition expressions).
I find this pretty confusing.

How about calling this AllowReadingErrnoInsideConditionExpressions. Still bad, but probably better.

558

Quote CheckerBase.td:39

/// Marks the entry hidden. Hidden entries won't be displayed in
/// -analyzer-checker-option-help.
class HiddenEnum<bit val> {
  bit Val = val;
}
def DontHide : HiddenEnum<0>;
def Hide : HiddenEnum<1>;

I believe, we should expose this option. Otherwise why would we introduce it in the first place?
The default value of this is actually DontHide in the CmdLineOption<> td class. So, you should be fine by simply omitting this keyword here.

clang/lib/StaticAnalyzer/Checkers/ErrnoChecker.cpp
11–12

and the StdLibraryFunctionsChecker per D125400.

102–103

We should also handle BinaryOperator for conditional operators (EQ, NE, LT, LE, ...). Or at least leave a FIXME about this.
Have a test demonstrating the behavior.

clang/lib/StaticAnalyzer/Checkers/ErrnoModeling.cpp
255

I believe you no longer need to cast enums to numbers and back for enums and other types.
get<ErrnoState>() should return the right type.
See D126801.

274
balazske updated this revision to Diff 437471.Jun 16 2022, 1:31 AM
balazske marked 7 inline comments as done.

applied review comments

balazske added inline comments.Jun 16 2022, 1:39 AM
clang/docs/analyzer/checkers.rst
2538–2539

Probably there is not a preferred or "standard" site for man page lookup and it can be done from command line too. But I can include a POSIX link https://pubs.opengroup.org/onlinepubs/9699919799/.

clang/include/clang/StaticAnalyzer/Checkers/Checkers.td
550

I like "improper use of errno" better, it is similar to HelpText of other checkers.

clang/lib/StaticAnalyzer/Checkers/ErrnoChecker.cpp
102–103

The idea is here that this search does not stop at binary operators but goes up the AST until a condition is found.

clang/lib/StaticAnalyzer/Checkers/ErrnoModeling.cpp
255

How to handle if no value exists?

261–262

I do not get what you mean here.

274

I can not omit c_str, there is a compile error probably related to lambda return type.

clang/lib/StaticAnalyzer/Checkers/ErrnoModeling.h
34

Without Errno_ these sound too generic. Probably have only Errno prefix, not Errno_?

balazske updated this revision to Diff 437513.Jun 16 2022, 5:36 AM

Use typed state trait.

It looks great.
Let me check the test coverage and the generated docs page it everything looks great.

clang/docs/analyzer/checkers.rst
2538
2540

Please mention at least one of those functions.

clang/include/clang/StaticAnalyzer/Checkers/Checkers.td
553–554

Well, now I see why you marked this Hide previously.
This is a modeling checker, thus it won't appear in the documentation nor in the -analyzer-checker-help etc.

Interestingly, other checkers mark some options Hide and other times DontHide.
E.g. DynamicMemoryModeling.Optimistic is displayed, but the DynamicMemoryModeling.AddNoOwnershipChangeNotes is hidden.
Whatever.

clang/lib/StaticAnalyzer/Checkers/ErrnoChecker.cpp
47

I don't really understand this comment. What is a non-condition part of a statement?

146

Might be more readable to early return instead.

206

Does this refer to the callsite or to the Decl location?
I think the former, which would be a bug I think.

clang/lib/StaticAnalyzer/Checkers/ErrnoModeling.cpp
261–262

I was thinking about this:

bool isErrno(const Decl *D) {
  if (const auto *VD = dyn_cast_or_null<VarDecl>(D))
    if (const IdentifierInfo *II = VD->getIdentifier())
      return II->getName() == ErrnoVarName;
  if (const auto *FD = dyn_cast_or_null<FunctionDecl>(D))
    if (const IdentifierInfo *II = FD->getIdentifier())
      return llvm::is_contained(ErrnoLocationFuncNames, II->getName());
  return false;
}

This way both checks would follow the same pattern, symmetry.

274

Yes, you will need to specify explicitly the return type of the lambda: -> std::string.

clang/lib/StaticAnalyzer/Checkers/ErrnoModeling.h
34

But you always refer to it by it's qualified name: errno_modeling::Errno_Irrelevant.
In which case this prefix is just noise IMO. That's what I'm about.

steakhal added inline comments.Jun 16 2022, 6:36 AM
clang/lib/StaticAnalyzer/Checkers/ErrnoModeling.cpp
73

You will need this to get it compiled.

Awesome!

The generated doc section looks great. The test coverage it excellent, but I would recommend adding tests for covering the following lines:

  • ErrnoChecker.cpp:99
  • ErrnoModeling.cpp:259
  • ErrnoModeling.cpp:264

There are two other uncovered cases, but those are mainly defensive checks, so I don't mind them.

clang/docs/analyzer/checkers.rst
2565

by

2586

I think it should be surrounded by double backticks. It looks ugly this way:

balazske updated this revision to Diff 437922.Jun 17 2022, 8:36 AM
balazske marked 16 inline comments as done.

Removed Errno_ prefix from ErrnoCheckState, documentation fixes, other small fixes.

steakhal accepted this revision.Jun 19 2022, 12:57 AM
steakhal added inline comments.
clang/lib/StaticAnalyzer/Checkers/ErrnoChecker.cpp
48

You prefixed the previous two keywords with the \c marker. Why is it missing from the switch?

101

Oh, so the test added for this actually uncovered a bug?

This revision is now accepted and ready to land.Jun 19 2022, 12:57 AM
balazske marked an inline comment as done.Jun 20 2022, 12:41 AM
balazske added inline comments.
clang/include/clang/StaticAnalyzer/Checkers/Checkers.td
553–554

It looks good now, ErrnoChecker is not a modeling checker and the option should be accessible for users.

clang/lib/StaticAnalyzer/Checkers/ErrnoChecker.cpp
48

Is it really missing? (I do not check Doxygen documentation, probably the new documentation appears in the online pages after the commit.)

101

Yes: Even if it looks natural that getCond can be used the same way as at normal ConditionalOperator this is not true: It returns something other (casted or converted) than the root condition expressions.

146

There is a else part.

206

CallF is a FunctionDecl that should be the original (first) declaration of the function.

This revision was landed with ongoing or failed builds.Jun 20 2022, 1:08 AM
This revision was automatically updated to reflect the committed changes.
balazske marked an inline comment as done.