Page MenuHomePhabricator

[clang][analyzer] Add stream functions to StdLibraryFunctionsChecker.
AbandonedPublic

Authored by balazske on Oct 5 2022, 1:16 AM.

Details

Summary

Some of the stream handling functions are modeled in StreamChecker.
These are now added to StdLibraryFunctionsChecker where additional
checks are performed on these functions. This includes setting of
'errno' related state. In this way the errno state is set for these
functions in StdLibraryFunctionsChecker, and many of the constraints
of the stream functions are checked at that place.
The goal is that StreamChecker will check for stream-state related
bugs, the generic argument and errno constraints that can be described
in StdLibraryFunctionsChecker are to be checked only there.

Diff Detail

Event Timeline

balazske created this revision.Oct 5 2022, 1:16 AM
Herald added a reviewer: NoQ. · View Herald Transcript
Herald added a project: Restricted Project. · View Herald Transcript
balazske requested review of this revision.Oct 5 2022, 1:16 AM
Herald added a project: Restricted Project. · View Herald TranscriptOct 5 2022, 1:16 AM
Herald added a subscriber: cfe-commits. · View Herald Transcript

This is another approach to D132017. This eliminates redundant code from StreamChecker that can be done with StdLibraryFunctionsChecker too. Something to fix is to somehow mention the dependency of apiModeling.StdCLibraryFunctions and ModelPOSIX=true option. If StreamChecker is used without these dependencies the errno part of the modeling can be incorrect (errno is not changed by the evalCall in StreamChecker, it is expected to be updated in the other checker in postCall).

balazske updated this revision to Diff 465423.Oct 5 2022, 9:00 AM

Fixed failing tests.

The two checkers work together and 'apiModeling.StdCLibraryFunctions'
and its 'ModelPOSIX=true' option should be now a dependency of
checker 'alpha.unix.Stream'.

Is there a way to stop the analysis with an error if "ModelPOSIX=true" is not set but the 'alpha.unix.Stream' checker is enabled? @Szelethus, maybe you have some insights with this?

Thanks for patch and the many test cases.

You still haven't answered my below concern:

Is there a way to stop the analysis with an error if "ModelPOSIX=true" is not set but the 'alpha.unix.Stream' checker is enabled?

clang/lib/StaticAnalyzer/Checkers/StdLibraryFunctionsChecker.cpp
1075

Please remove this line (leftover of some debugging?).

1088–1092

Why do we need this change?

1766–1839

Very good!

clang/lib/StaticAnalyzer/Checkers/StreamChecker.cpp
603–613

This is redundant with the summary in the StdLibraryFunctionsChecker. Why do we need this as well?

Probably if the StdLibraryFunctionsChecker object can be get from StreamChecker (a new function is needed) it is possible to check the option and use CheckerManager::reportInvalidCheckerOptionValue (for the StdLibraryFunctionsChecker). But not sure if it does what we want here.

clang/lib/StaticAnalyzer/Checkers/StdLibraryFunctionsChecker.cpp
1088–1092

It is possible that only the errno related state is changed, no new constraints are added (if the constraint is already here from evalCall but the errno was not set there, for example at fclose or other stream functions maybe no new state is created here). In such case the note tag is still needed.

clang/lib/StaticAnalyzer/Checkers/StreamChecker.cpp
603–613

It is probably needed to have a (any) bound value to the fclose function call, otherwise setting constraints in the other checker do not work. It may work to bind only a conjured value but it looks better if the correct return value is used. This makes less inter-dependence between the two checkers (StdLibraryFunctionChecker sets only the errno state as far as possible).

If StdCLibraryFunctionsChecker is added as dependency to StreamChecker and no other thing is changed these tests fail, and I could not find out the reason:
Errors in test stream.c:

error: 'warning' diagnostics expected but not seen: 
  File /local/clang/llvm2/llvm-project/clang/test/Analysis/stream.c Line 7: Stream pointer might be NULL
  File /local/clang/llvm2/llvm-project/clang/test/Analysis/stream.c Line 13: Stream pointer might be NULL
  File /local/clang/llvm2/llvm-project/clang/test/Analysis/stream.c Line 76: Stream pointer might be NULL

Errors in test stream-error.c:

error: 'warning' diagnostics expected but not seen: 
  File /local/clang/llvm2/llvm-project/clang/test/Analysis/stream-error.c Line 97: might be 'indeterminate'
  File /local/clang/llvm2/llvm-project/clang/test/Analysis/stream-error.c Line 100: Stream might be already closed
  File /local/clang/llvm2/llvm-project/clang/test/Analysis/stream-error.c Line 113: Read function called when stream is in EOF state
  File /local/clang/llvm2/llvm-project/clang/test/Analysis/stream-error.c Line 114: Read function called when stream is in EOF state
  File /local/clang/llvm2/llvm-project/clang/test/Analysis/stream-error.c Line 123: FALSE
  File /local/clang/llvm2/llvm-project/clang/test/Analysis/stream-error.c Line 124: FALSE
  File /local/clang/llvm2/llvm-project/clang/test/Analysis/stream-error.c Line 128: FALSE
  File /local/clang/llvm2/llvm-project/clang/test/Analysis/stream-error.c Line 129: TRUE
  File /local/clang/llvm2/llvm-project/clang/test/Analysis/stream-error.c Line 133: TRUE
  File /local/clang/llvm2/llvm-project/clang/test/Analysis/stream-error.c Line 134: FALSE

Probably the StdLibraryFunctionsChecker runs before the StreamChecker, and runs always after it if no dependency is there? But this does not explain all test errors and should cause no errors at all. Adding the dependency itself is not enough, ModelPOSIX option should be true too. If the option is set to true in the test file even more test errors appear, and it does not work even when the (StdLibraryFunctionsChecker) checker is added (to the RUN command). Without the dependency the tests do not fail if the order of checkers in the enabled checkers list is changed.

If StdCLibraryFunctionsChecker is added as dependency to StreamChecker and no other thing is changed these tests fail, and I could not find out the reason:
Errors in test stream.c:

error: 'warning' diagnostics expected but not seen: 
  File /local/clang/llvm2/llvm-project/clang/test/Analysis/stream.c Line 7: Stream pointer might be NULL
  File /local/clang/llvm2/llvm-project/clang/test/Analysis/stream.c Line 13: Stream pointer might be NULL
  File /local/clang/llvm2/llvm-project/clang/test/Analysis/stream.c Line 76: Stream pointer might be NULL

Errors in test stream-error.c:

error: 'warning' diagnostics expected but not seen: 
  File /local/clang/llvm2/llvm-project/clang/test/Analysis/stream-error.c Line 97: might be 'indeterminate'
  File /local/clang/llvm2/llvm-project/clang/test/Analysis/stream-error.c Line 100: Stream might be already closed
  File /local/clang/llvm2/llvm-project/clang/test/Analysis/stream-error.c Line 113: Read function called when stream is in EOF state
  File /local/clang/llvm2/llvm-project/clang/test/Analysis/stream-error.c Line 114: Read function called when stream is in EOF state
  File /local/clang/llvm2/llvm-project/clang/test/Analysis/stream-error.c Line 123: FALSE
  File /local/clang/llvm2/llvm-project/clang/test/Analysis/stream-error.c Line 124: FALSE
  File /local/clang/llvm2/llvm-project/clang/test/Analysis/stream-error.c Line 128: FALSE
  File /local/clang/llvm2/llvm-project/clang/test/Analysis/stream-error.c Line 129: TRUE
  File /local/clang/llvm2/llvm-project/clang/test/Analysis/stream-error.c Line 133: TRUE
  File /local/clang/llvm2/llvm-project/clang/test/Analysis/stream-error.c Line 134: FALSE

Probably the StdLibraryFunctionsChecker runs before the StreamChecker, and runs always after it if no dependency is there? But this does not explain all test errors and should cause no errors at all. Adding the dependency itself is not enough, ModelPOSIX option should be true too. If the option is set to true in the test file even more test errors appear, and it does not work even when the (StdLibraryFunctionsChecker) checker is added (to the RUN command). Without the dependency the tests do not fail if the order of checkers in the enabled checkers list is changed.

Okay, this is bad. We must understand the reasons behind these failures. It is very strange that the ModelPOSIX option triggers these failures. I think we have to hunt down and examine each failing check, could you please continue the debugging of these?

clang/lib/StaticAnalyzer/Checkers/StdLibraryFunctionsChecker.cpp
1088–1092

Okay, please add that as a comment to this new hunk.

clang/lib/StaticAnalyzer/Checkers/StreamChecker.cpp
603–613

Ok, makes sense.

balazske updated this revision to Diff 474556.Nov 10 2022, 8:37 AM

Rebase to main and D137722.

balazske edited the summary of this revision. (Show Details)Nov 10 2022, 8:57 AM
This comment was removed by Szelethus.
clang/lib/StaticAnalyzer/Checkers/StdLibraryFunctionsChecker.cpp
787–788

Hmm, do we need to specify this? Can't this be the default?

balazske added inline comments.Dec 6 2022, 7:08 AM
clang/lib/StaticAnalyzer/Checkers/StdLibraryFunctionsChecker.cpp
787–788

This is only used in cases where errno should not change (previous value remains, no invalidation happens). This works if another checker has evalCall for the function and does not change errno (otherwise a default evaluation occurs that invalidates the system region of errno).
The StreamChecker is an example with some functions added in the next patch. This class is really not used in this patch, could be moved to the next.

Sorry abour my previous reply, I messed up the thread I was replying to. I better see what is going on.

Do you have a better handle on @martong's previous comment (D135247#3867603)? Do we know why this strange behaviour occurs with StreamChecker? I assumed that the stream related modeling is done in parallel to that. Is it about the return value?

Probably if the StdLibraryFunctionsChecker object can be get from StreamChecker (a new function is needed) it is possible to check the option and use CheckerManager::reportInvalidCheckerOptionValue (for the StdLibraryFunctionsChecker). But not sure if it does what we want here.

You can only retrieve a checker object if you can access the class definition of it (as you are aware), so unless you move StdLibraryFunctionsChecker to a header file, which doesn't seem like the correct option now (kinda goes against the independent module architecture (despite us talking about intertwining these "independent" modules)).

For me, the right course seems to be to detach this option from StdLibraryFunctionsChecker, and make it an analyzer level option. If this option so severely impacts both checkers, it feels more like an analyzer level option anyways. That will enable you to query the option from shouldRegisterStreamChecker. Whether you'd prefer to hard error or only warn about the failure to enable a checker is something I also struggle with, but you could start out with the hard error first.

The "strange" test failures that showed up earlier were probably caused by a bug that is fixed in the D137722. I just read that this patch is rebased to D137722 too, fixed the dependency stack. There was another problem with circular dependencies (because StdCLibraryFunctionArgsChecker had a dependency to StreamChecker, this is removed in the last patch). The checker option must be not a problem, the checker (StdLibraryFunctionsChecker) can be disabled (but is normally not because it is an apiModeling checker) or the ModelPOSIX option turned off independently if StreamChecker is enabled or not. The goal (should work at the end of this patch stack) is that StreamChecker can report all bugs that it can find, and there is no case when both checkers report a bug (in different way). If ModelPOSIX is turned off and StreamChecker is enabled, for fseek for example no bug is found if stream is NULL, and value of errno is just invalidated in all cases (like it works if StreamChecker is disabled too), but the stream state and file position is still checked by StreamChecker for all functions.

The patch looks OK now, I'll get to inspecting the others.

The "strange" test failures that showed up earlier were probably caused by a bug that is fixed in the D137722. I just read that this patch is rebased to D137722 too, fixed the dependency stack.

Very well!

There was another problem with circular dependencies (because StdCLibraryFunctionArgsChecker had a dependency to StreamChecker, this is removed in the last patch). The checker option must be not a problem, the checker (StdLibraryFunctionsChecker) can be disabled (but is normally not because it is an apiModeling checker) or the ModelPOSIX option turned off independently if StreamChecker is enabled or not.

Okay, so the checker behaves OK if StdLibraryFunctionsChecker is disabled. As long as it doesn't crash, this is fine, you shouldn't disable it in practice anyways!

The goal (should work at the end of this patch stack) is that StreamChecker can report all bugs that it can find, and there is no case when both checkers report a bug (in different way). If ModelPOSIX is turned off and StreamChecker is enabled, for fseek for example no bug is found if stream is NULL, and value of errno is just invalidated in all cases (like it works if StreamChecker is disabled too), but the stream state and file position is still checked by StreamChecker for all functions.

This sounds reasonable. It means that no parts of StdLibraryFunctionsChecker (including its option) is a "hard" dependency.

NoQ added a comment.Dec 13 2022, 2:53 PM

I like where this is going. Is there a plan to enable ModelPOSIX by default, at least on POSIX-compliant target triples?

I'm somewhat worried that some users may find null dereference warnings on fopen() more annoying than useful (though they're definitely more valuable than, say, a null dereference warning over malloc() which can technically fail but in practice it just doesn't). In any case, these warnings will need to be properly explained with notes, otherwise it's very likely that reports are going to be hard to understand.

NoQ added a comment.Dec 13 2022, 2:56 PM

Also, similarly to getenv(), in these cases domain-specific knowledge may help suppress some unwanted reports. Eg., if a file has been opened successfully, this doesn't technically mean that it'll be open successfully again, but it makes it more likely, and the user does not necessarily care about time-of-check-time-of-use races. So maybe it'd make sense to eventually move some of that stuff to StreamChecker anyway. Maybe not, hard to tell, need to see the results.

Is there a plan to enable ModelPOSIX by default, at least on POSIX-compliant target triples?

Yes, this would be more convenient. But all other modeled functions (not only in the ModelPOSIX part of the code) must be checked if they are POSIX compliant. The functions where there is difference between POSIX and non-POSIX (C standard only) can be added to the summary list in both cases with different summary. (The C standard draft specifies much less about how errno can be used.)

balazske marked 3 inline comments as done.Dec 15 2022, 2:38 AM

Also, similarly to getenv(), in these cases domain-specific knowledge may help suppress some unwanted reports. Eg., if a file has been opened successfully, this doesn't technically mean that it'll be open successfully again, but it makes it more likely, and the user does not necessarily care about time-of-check-time-of-use races. So maybe it'd make sense to eventually move some of that stuff to StreamChecker anyway. Maybe not, hard to tell, need to see the results.

This knowledge of "call history" can be implemented in an other checker, for the stream functions in StreamChecker, for getenv in a probably new checker (where the variable name could be stored). This StdLibraryFunctionChecker does not create the branch if the conditions (constraints) of a branch (summary case) are not satisfied. If another checker added branches in evalCall (for a success and failure case or only one of them) these are "selected" here only, not added.

clang/lib/StaticAnalyzer/Checkers/StdLibraryFunctionsChecker.cpp
1088–1092

The add of NoteTags could be improved. Probably a NoteTag should be displayed here if the return value (the "function call itself") is interesting. A text message should be specified for every function and the errno-related part added to it programatically if needed (if errno is interesting).

balazske updated this revision to Diff 483106.Dec 15 2022, 2:39 AM
balazske marked an inline comment as done.
balazske edited the summary of this revision. (Show Details)

Rebase to current main, fix of comment issues.

balazske updated this revision to Diff 483213.Dec 15 2022, 9:06 AM

Removed change of errno state from StreamChecker.
Added cases for zero buffer size in fread and fwrite.

balazske updated this revision to Diff 483446.Dec 16 2022, 12:25 AM

Updated test in errno-noopen.c.
The functions feof and ferror reset errno, the previous test was incorrect in this way.

balazske abandoned this revision.Dec 20 2022, 8:52 AM