This is an archive of the discontinued LLVM Phabricator instance.

Paths

Table of Contentst

-
clang/lib/StaticAnalyzer/Checkers/
-
lib/
-
StaticAnalyzer/
-
Checkers/
32/63
StreamChecker.cpp

Differential D75682

[Analyzer][StreamChecker] Introduction of stream error handling.
ClosedPublic

Authored by balazske on Mar 5 2020, 7:29 AM.

Download Raw Diff

Details

Reviewers

Szelethus
NoQ
Charusso
baloghadamsoftware
xazax.hun

Commits

rG11bd3e5c6549: [Analyzer][StreamChecker] Introduction of stream error handling.

Summary

A new field is added to the stream state to store the actual error.
Function fseek is implemented to set the error state.
(More are to be added later.)
Modeling of stream error state handling functions is added.

(clearerr is not done yet but should follow.)

Diff Detail

Repository: rG LLVM Github Monorepo

Event Timeline

balazske created this revision.Mar 5 2020, 7:29 AM

Herald added a reviewer: Szelethus. · View Herald TranscriptMar 5 2020, 7:29 AM

Herald added a project: Restricted Project. · View Herald Transcript

Herald added subscribers: cfe-commits, martong, Charusso and 9 others. · View Herald Transcript

For older discussion see this:
https://reviews.llvm.org/D75356

Functions feof and ferror are added to show how the stream error flags are used and to make tests possible.

Some explanation of the error states:

EofError indicates an EOF condition in the stream.
OtherError indicates a generic (I/O or other but not EOF) error.
AnyError is a "placeholder" if the exact error kind is not important. This is a "Schrödinger's cat" type of state: If we need to observe the exact error it can change to Eof or Other error. (See code in evalFeof and evalFerror, probably this is the only place for this event.) This error kind is used to save state splits, other solution is to make for example after fseek a new state for EofError and another for OtherError. The file error functions are relatively seldomly used (?), because checking success of an operation can be done with the return value and there is no need to check ferror or feof. And for analyzer warnings it is often enough to know if any error happened, not if exactly EOF or other error.

We can not make a warning if a stream operation is called in failed state: The error was probably handled based on the return value of the previous failed operation. This thing is checked by the "ErrorReturnChecker" (https://reviews.llvm.org/D72705). Only some special functions like getc are to be handled specially where the error state has importance.

Harbormaster completed remote builds in B48199: Diff 248478.Mar 5 2020, 8:13 AM

balazske added a parent revision: D75614: [Analyzer][StreamChecker] Check for opened stream before operations..Mar 6 2020, 12:48 AM

Let's halt this for now -- we have so many revisions going on seemingly doing the same thing, its becoming really confusing. Please finish D75356 before proceeding.

This revision now requires changes to proceed.Mar 6 2020, 6:47 AM

balazske mentioned this in D75356: [Analyzer][StreamChecker] Introduction of stream error state handling..Mar 6 2020, 7:12 AM

balazske added reviewers: NoQ, Charusso, baloghadamsoftware, xazax.hun.Mar 6 2020, 7:24 AM

Herald added a subscriber: rnkovacs. · View Herald TranscriptMar 6 2020, 7:24 AM

Removed support for fseek, added clearerr.
This is now error handling only without fseek.
No functions set the error flags so test for
clearerr is not possible.

Harbormaster failed remote builds in B48540: Diff 249085!Mar 9 2020, 7:30 AM

Fixed lint warnings.

balazske added a child revision: D75851: [Analyzer][StreamChecker] Added evaluation of fseek..Mar 9 2020, 8:42 AM

Harbormaster completed remote builds in B48556: Diff 249107.Mar 9 2020, 9:08 AM

Can we see more test cases for when after a call to feof we indeed can tell the stream is in an EOF state? Same for ferror. I feel like there is a lot of code we don't yet cover.

Also, I see now that I didn't get in earlier revisions that OtherError actually refers to ferror() -- sorry about the confusion :)

clang/lib/StaticAnalyzer/Checkers/StreamChecker.cpp
72–84	Hmm, now that I think of it, could we just merge these 2 enums? Also, I fear that indexers would accidentally assign the comment to the enum after the comma: Opened, /// Stream is opened. Closed, /// Closed stream (an invalid stream pointer after it was closed). OpenFailed /// The last open operation has failed. `/// Stream is opened` might be assigned to `Closed`. How about this: /// Stream is opened. Opened, /// Closed stream (an invalid stream pointer after it was closed). Closed, /// The last open operation has failed. OpenFailed
82	Shouldn't we call this `FError` instead, if this is set precisely when `ferror()` is true? Despite the comments, I still managed to get myself confused with it :)
83	When do we know that the stream is in an error state, but not precisely know what that error is within the context of this patch? `fseek` would indeed introduce such a state, as described in D75356#inline-689287, but that should introduce its own error `ErrorKind`. I still don't really get why we would ever need `AnyError`.
92	Same, shouldn't this be `isFError()`?
93	This is confusing -- I would expect a method called `isAnyError()` to return true when `isNoError()` is false.
369	What if we call `clearerr()` on a stream that is in an `feof()` state? Shouln't we return if the stream is `!isOtherError()` (or `!isFError()`, if we were to rename it)?
397	Does this ever run? We never actually set the stream state to `AnyError`.
418–419	This function is practically the same as `evalFeof`, can we merge them?

Szelethus requested changes to this revision.Mar 10 2020, 6:52 AM

This revision now requires changes to proceed.Mar 10 2020, 6:52 AM

The problem here with tests is that there is no function to call that sets the error flags FEof or FError. So it is only possible to check if these are not set after opening the file.

clang/lib/StaticAnalyzer/Checkers/StreamChecker.cpp
72–84	Probably these can be merged, it is used for a stream that is in an indeterminate state after failed `freopen`, but practically it is handled the same way as a closed stream. But this change would be done in another revision.
82	Plan is to rename to `NoError`, `FEofError`, `FErrorError`, `FEofOrFErrorError`.
83	This is the problem is the change is too small: The next part of it introduces functions where the new error flags are used. In this patch the AnyError feature is not used (the feof and ferror read it but nothing sets it).
93	The error state kinds are not mutually exclusive. The "any error" means we know that there is some error but we do not know what the error is (it is not relevant because nothing was done that depends on it). If a function is called that needs to know if "ferror" or "feof" is the error, the state will be split to `FErrorError` and `FEofError`.
369	`clearerr` does not return any value. It only clears the error flags (sets to false). In my interpretation the stream has one error value associated with it, this value may be EOF or "other" error, or no error. There are not two error flags, only one kind of error that may be EOF or other.
418–419	It is not the same code ("EOF" and "other error" are interchanged), but can be merged somehow.

You've sold me on AnyError, we should have something like that with the addition of NoteTags. With that said, I feel uneasy about adding it in this patch and not testing it properly. I can also sympathize with your anxiety against bloating the fseek patch further by simply moving the code there, but I'm just not sure there is a good way to avoid it. I have a suggestion:

Remove the code that adds and handles AnyError. Merge the two enums in the StreamState, and make ensureSteamOpened emit a warning if the stream is either feof() or ferror(). Add NoteTags saying something along the lines of:

if (feof(F)) { // note: Assuming the condition is true
               // note: Stream 'F' has the EOF flag set
  fgets(F); // warning: Illegal stream operation on a stream that has EOF set
}

I think that would make this a very neat, self contained patch.

Add the removed code to the fseek patch (preferably with the name UnknownError). Make ensureSteamOpened emit a warning on this enum value as well. I don't think it would bloat the patch too much -- we just simply need that much to make sense of it.

What do you think?

clang/lib/StaticAnalyzer/Checkers/StreamChecker.cpp
72–84	I meant to merge the two enums (`StreamState` and `ErrorKindTy`) and the fields associated with them (`State` and `ErrorState`). We could however merge `Closed` and `OpenFailed`, granted that we put a `NoteTag` to `evalFreopen`. But I agree, that should be another revision's topic :)
83	Okay, after your comment in the other revision, I see how such a kind should be left, and the bug report should rather be enriched by `NoteTags`. With that said, `AnyError` is still a bit confusing as a name -- how abour `UnknownError`?
90	If a stream's opening failed, its still an error, yet this getter would return true for it. I think this is yet another reason to just merge the 2 enums :)
93	How about we change the name of it to `isUnknownError`? My main issue is that to me, the name `isAnyError()` answeres the question whether the stream is in any form of erroneous state.
369	The `clearerr()` function clears the end-of file and error indicators for the given stream. Yup, you're totally right on this one. My bad.

I do not understand totally this remove-of-AnyError thing. If handling the AnyError will be removed the code remains still not testable. Because none of the possible failing file operations are modeled in this revision. A test with code like if (feof(F)) { can not work because feof(F) is never true without a previously failed operation. (We decide not at the moment of calling feof that feof will return true, it is decided at a previously failed operation, together with the return value of that operation.)

clang/lib/StaticAnalyzer/Checkers/StreamChecker.cpp
83	I want it to rename to `FEofOrFErrorError`, this tells exactly what it is (`UnknownError` can be still thought as a third error type). For these 2 error types this naming is not too long, and new error types are not likely to appear here later.
90	But as the comment says, the error state is applicable only in the opened state. If not opened, we may not call the "error" functions at all (assertion can be added). Anyway it is possible to merge the enums, then the `isOpened` will be a bit difficult (true if state is one of `OpenedNoError`, `OpenedFErrorError`, `OpenedFEofError`, `OpenedFEofOrFErrorError`). Logically it is better to see that the state is an aggregation of two states, an "openedness" and an error state. Probably this does not matter much here because it is unlikely that new states will appear that make the code more complex.

xazax.hun added inline comments.Mar 11 2020, 3:53 AM

clang/lib/StaticAnalyzer/Checkers/StreamChecker.cpp
72–84	Since you mentioned that ErrorState is only applicable to Open streams, I am also +1 on merging the enums. These two states are not orthogonal, no reason for them to be separate.
408	As you probably know we are splitting the execution paths here. This will potentially result in doubling the analysis tome for a function for each `feof` call. Since state splits can be really bad for performance we need good justification for each of them. So the questions are: Is it really important to differentiate between eof and other error or is it possible to just have an any error state? Do we find more bugs in real world code that justifies these state splits?

baloghadamsoftware added inline comments.Mar 11 2020, 4:47 AM

clang/lib/StaticAnalyzer/Checkers/StreamChecker.cpp
72–84	Not orthogonal, but rather hiearchical. That is a reason for not merging. I am completely against it.
82	If would suggest `NoError`, `FEoF`, `FError`, `FEoFOrFError`. Too many `Error` suffixes reduce the readability.
90	I would not merge the two enums: this way they are hierarchical. When only checking the openness we do not need to check for 4 different states. I completely agree with @balazske here.
93	Yes, "Any" usually means any of the possible kinds. But instead of `UnknownError` use `FEoFOrFerror`.
397	I suggest to move codes that are only executed after a new functionality is added in a subsequent patch to that patch.
408	I agree here. Do not introduce such forks without specific reason.
418–419	Maybe using a template with a functor parameter and pass a lambda in the two instantiations? (See `DebugContainerModeling` and `DebugIteratorModeling`) and also some example is `Iterator.cpp`.

balazske marked 2 inline comments as done.Mar 11 2020, 5:14 AM

balazske added inline comments.

clang/lib/StaticAnalyzer/Checkers/StreamChecker.cpp
408	`feof` and `ferror` should return the same value if called multiple times (and no other file operations in between). If the exact error is not saved in the state, how can we ensure that the calls return the same value?
418–419	I have already a template implementation, but probably it is not needed now.

xazax.hun added inline comments.Mar 11 2020, 5:22 AM

clang/lib/StaticAnalyzer/Checkers/StreamChecker.cpp
72–84	In a more expressive language each enum value could have parameters and we could have Opened(ErrorKind), Closed, OpenFailed While we do not have such an expressive language, we can simulate this using the current constructs such as a variant. The question is, does this worth the effort? At this point this is more like the matter of taste as long as it is properly documented.

baloghadamsoftware added inline comments.Mar 11 2020, 5:51 AM

clang/lib/StaticAnalyzer/Checkers/StreamChecker.cpp
82	I suggest to omit the last error state in this patch and introduce it in the patch which uses it.

What is better? Have this (or similar) change that adds a feature that is used in a later change and is only testable in that later change, or have a bigger change that contains real use of the added features? (This means: Add fseek to this change or not.) The error handling is not normally testable if there is no function that generates error state, fseek could be one.

balazske marked 4 inline comments as done.Mar 11 2020, 7:06 AM

balazske added inline comments.

clang/lib/StaticAnalyzer/Checkers/StreamChecker.cpp
72–84	Have a `bool isOpened` and an error kind is better?

In D75682#1916809, @balazske wrote:

What is better? Have this (or similar) change that adds a feature that is used in a later change and is only testable in that later change, or have a bigger change that contains real use of the added features? (This means: Add fseek to this change or not.) The error handling is not normally testable if there is no function that generates error state, fseek could be one.

I think neither: add the feature (the last enum value and its handling in the functions) in the fseek() patch were it is used.

And why not add first the "unknown error" state and later the feof and ferror error states? Or add every error state but not the feof and ferror functions? Every way results in an intermediate state of the checker.

I see that we're a bit stuck on naming, and that is largely my fault. Apologies, let us focus on high level stuff for a bit.

In D75682#1916435, @balazske wrote:

I do not understand totally this remove-of-AnyError thing. If handling the AnyError will be removed the code remains still not testable.

In D75682#1915809, @Szelethus wrote:
Remove the code that adds and handles AnyError. Merge the two enums in the StreamState, and make ensureSteamOpened emit a warning if the stream is either feof() or ferror(). Add NoteTags saying something along the lines of:
if (feof(F)) { // note: Assuming the condition is true
               // note: Stream 'F' has the EOF flag set
  fgets(F); // warning: Illegal stream operation on a stream that has EOF set
}

If we added warnings to this patch, that would be testable.

In D75682#1916809, @balazske wrote:

What is better? Have this (or similar) change that adds a feature that is used in a later change and is only testable in that later change, or have a bigger change that contains real use of the added features? (This means: Add fseek to this change or not.) The error handling is not normally testable if there is no function that generates error state, fseek could be one.

Definitely the latter, and all you need to do is check whether the stream is in an error state in ensureStreamOpened, no need for fseek just yet.

In D75682#1916867, @baloghadamsoftware wrote:

I think neither: add the feature (the last enum value and its handling in the functions) in the fseek() patch were it is used.

Yup.

clang/lib/StaticAnalyzer/Checkers/StreamChecker.cpp
72–84	That sounds wonderful, @balazske! :)
83	Okay, sure.

Also, thank you guys for chipping in!

clang/lib/StaticAnalyzer/Checkers/StreamChecker.cpp
408	This is a very interesting question indeed, never crossed my mind before. I think it would be better to move `AnyError` to the next revision and discuss it there.

baloghadamsoftware added inline comments.Mar 11 2020, 8:05 AM

clang/lib/StaticAnalyzer/Checkers/StreamChecker.cpp
72–84	Yes, we do not need the `OpenFailed` state either. A stream is either opened or not. This is the state of the stream. If there is any error, there are the error codes for.

Removed AnyError.
Using common code for evalFeof and ferror.

Adding correct diff.

Harbormaster completed remote builds in B48824: Diff 249642.Mar 11 2020, 9:37 AM

Harbormaster completed remote builds in B48825: Diff 249643.

Could you please add the warning we discussed? That would be a great demonstration of this patch's capabilities, and wouldn't deserve its own patch.

In D75682#1917257, @Szelethus wrote:

Could you please add the warning we discussed? That would be a great demonstration of this patch's capabilities, and wouldn't deserve its own patch.

Was it this warning?

if (feof(F)) { // note: Assuming the condition is true
               // note: Stream 'F' has the EOF flag set
  fgets(F); // warning: Illegal stream operation on a stream that has EOF set
}

The checker does not work by "assuming feof is true" (this can be a later feature, if the F is not tracked and a feof call is encountered). Value of feof (more precisely: if any error happened) is fixed when the file operation is called that causes the error. The warning above should be get even if the feof call is missing because it is still possible that the previous operation resulted in EOF state. Adding a similar warning is not a small change and the "infrastructure" for it is not ready in this revision. And still that warning would not be testable because FEof error state is never set in this patch. By the way, a file read is not invalid if any error is there, it simply fails again. But there is many possibility for warnings such as "file read called in EOF state", "feof called when it is proven to be false/true", "invalid file read/write after failed fseek/fread/fwrite" but these belong to next revisions.

Probably it would be better to make a full working prototype first, because the design decisions made now can turn out to be wrong later when the new functionality would be added and we do "not see the full picture" now.

balazske marked 9 inline comments as done.Mar 13 2020, 1:47 AM

balazske added inline comments.

clang/lib/StaticAnalyzer/Checkers/StreamChecker.cpp
72–84	The "error kind" concept is needed separately from the stream state, see D75851. This is a reason for not merging them.
397	This is not applicable any more, `AnyError` is removed (from here).
408	The state fork problem is solved with `UnknownError` is the later revision, here it is no problem any more. A single state split is always needed because the return values differ in the failed and success case, and I want to make correlation between return values and stream error state.

Thanks for sticking this out! It just takes me a while to get to the mental space regarding this patch you are already in, sometimes through saying incorrect statements or stupid suggestions! :) If others could chip in in this discussion, it would be most appreciated, because I fear that otherwise this patch is a bit too exposed to my own wrongs.

In D75682#1918839, @balazske wrote:
In D75682#1917257, @Szelethus wrote:

Could you please add the warning we discussed? That would be a great demonstration of this patch's capabilities, and wouldn't deserve its own patch.

Was it this warning?
if (feof(F)) { // note: Assuming the condition is true
               // note: Stream 'F' has the EOF flag set
  fgets(F); // warning: Illegal stream operation on a stream that has EOF set
}
The warning above should be get even if the feof call is missing because it is still possible that the previous operation resulted in EOF state.

What I meant to demonstrate (but failed to describe) is that if the condition of the if statement is !feof(F), we shouldn't get the warning.

The checker does not work by "assuming feof is true" (this can be a later feature, if the F is not tracked and a feof call is encountered). Value of feof (more precisely: if any error happened) is fixed when the file operation is called that causes the error. [...] And still that warning would not be testable because FEof error state is never set in this patch.

I suspect that at this point you are 2 steps ahead of me, but this is how I would imagine such a patch to look like: we evaluate feof() by splitting the state (this should be worth the cost, because why would we ever call feof if the state couldn't be EOF or non-EOF?), associate the state where its return value is true with the argument being marked as EOF, and the other as non-EOF.

This obviously wouldn't be sufficient -- values retrieved from stream reading operations may be EOF themselves, which should make us set the appropriate state for the originating stream, but that should indeed be the topic of a followup patch.

What is the goal here if not this? I have a suspicion I'm just not seeing the obvious here, but I don't get it just yet.

By the way, a file read is not invalid if any error is there, it simply fails again.

Oh, yea, silly me. You're totally right :)

Probably it would be better to make a full working prototype first, because the design decisions made now can turn out to be wrong later when the new functionality would be added and we do "not see the full picture" now.

If you try to implement a warning just to see how this would turn out long term, that would indeed be a good idea. Feel free to upload them, if you prefer, as a WIP patch!

clang/lib/StaticAnalyzer/Checkers/StreamChecker.cpp
399–400	Is this correct? We set the `feof()` return value whether we know the state of the stream to be EOF, but in this case we would incorrectly mark it as non-EOF: $ cat a.txt # contains a single newline $ cat test.cpp #include <cstdio> void tryRead(FILE F) { char C = fgetc(F); if (feof(F)) printf("The stream is EOF!\n"); else printf("The stream is good! Read '%c'\n", C); } int main() { FILE F = fopen("a.txt", "r"); tryRead(F); tryRead(F); // Incorrectly mark F to be non-EOF } $ clang test.cpp -fsanitize=address && ./a.out The stream is good! Read ' ' The stream is EOF! Wouldn't it be safer to assume it to be `true` if we know its EOF and do nothing otherwise? How about a state split? (I know this functions also handler FError, but you see what my point is.)
clang/test/Analysis/stream-error.c
33–34 ↗	(On Diff #249643)	Sure, we don't, but the bigger issue here is that we wouldn't mark `F` to be in EOF even if we did handle it, we would also need to understand that `ch == EOF` is the telling sign. But that also ins't in the scope of this patch :)

balazske marked an inline comment as done.Mar 13 2020, 9:14 AM

balazske added inline comments.

clang/lib/StaticAnalyzer/Checkers/StreamChecker.cpp
399–400	The problem in this code seems to be that `fgetc` is not handled (yet) by the checker. The plan is to handle every possible file operation. The handler of `fgetc` would split the state to EOF and non-EOF case. In the current state this is a real "bug", it could be handled somehow by checking for escaped stream (the `fgetc` here should cause escape of the file pointer if not recognized as file operation). But this means again a new patch before this one. Or add every file operation to this patch with some simple implementation or with "forgetting" the stream?

It looks like for now there's no way to put the stream into an error state, right?

clang/lib/StaticAnalyzer/Checkers/StreamChecker.cpp
72–84	Yes, we do not need the `OpenFailed` state either. A stream is either opened or not. This is the state of the stream. If there is any error, there are the error codes for. Nope, there's also the state in which the stream was before it was opened, and it's different from being opened and closed. Most of the time we simply don't track the stream in such state, but when an open fails, we suddenly start tracking it, so we need a separate enum. Like, we'll have to emit different diagnostic messages depending on whether the stream is in `Closed` state or in an `OpenFailed` state and this information should be there at the bug report site, so it's part of the program state.
92	`llvm::function_ref`?
116	Simply `C.getLocationContext()`.
372	So what exactly happens when you're calling `clearerr` on a closed stream? Does it actually become opened? Also what happens when you're calling it on an open-failed stream?
385–387	Maybe we should add a helper function for this, `CallEvent::getOriginCallExpr()`. That's an otherwise ugly function given that `CallEvent` is a class hierarchy and the superclass doesn't need to know about subclasses, but most users do in fact only care about call-expressions and concise checker API is important. Same with `dyn_cast_or_null<FunctionDecl>(Call.getDecl())` that could be written as `Call.getFunctionDecl()`.
394	I recommend not introducing a new symbol when you're going to return 0 anyway. A plain 0 is easier on the constraint solver than a symbol constrained to 0.
clang/test/Analysis/stream-error.c
33–34 ↗	(On Diff #249643)	Yeah, that's gonna be fun.

balazske marked 6 inline comments as done.Mar 16 2020, 2:00 AM

balazske added inline comments.

clang/lib/StaticAnalyzer/Checkers/StreamChecker.cpp
92	`function_ref`'s documentation says: This class does not own the callable, so it is not in general safe to store a function_ref. The `FnDescription` stores these functions so it is not safe to use `llvm::function_ref`?
372	The stream gets always in opened and error-free state. The `preDefault` checks that the stream is opened (not closed and not open-failed) and creates error node if not. If this create fails that is a problem case, the stream will be opened after `clearerr`. Should check in the eval functions for closed stream even if the pre-functions (in `preCall` callback) have normally checked this?
385–387	Can be done in another revision together with changing every use of `dyn_cast_or_null<CallExpr>(Call.getOriginExpr())` to the new function.
clang/test/Analysis/stream-error.c
33–34 ↗	(On Diff #249643)	If there is a `fputc` call (for example) that is recognized (has recognized stream and no functions "fail") there is already an error and non-error branch created for it. On the error branch the return value of `fputc` would be EOF and the stream state is set to error. (The `fputc` should be later modeled by this checker, not done yet.) If the `ch == EOF` is found then to figure out if this `ch` comes from an `fputc` that is called on a stream that is not recognized (it may be function parameter) and then make a "tracked" stream with error state for it, this is another thing.

NoQ added inline comments.Mar 16 2020, 7:02 AM

clang/lib/StaticAnalyzer/Checkers/StreamChecker.cpp
92	I think you're using it only for global function pointers, no?
372	Aha, ok, got it. Let's assert it then?
clang/test/Analysis/stream-error.c
33–34 ↗	(On Diff #249643)	Aha, fair enough, that's probably a well-justified state split.

Addressed some comments.

balazske marked 3 inline comments as done.Mar 16 2020, 10:10 AM

balazske added inline comments.

clang/lib/StaticAnalyzer/Checkers/StreamChecker.cpp
92	Probably can work but I tried it and got compile errors.

Harbormaster completed remote builds in B49327: Diff 250588.Mar 16 2020, 10:24 AM

Riiiight I think I finally get it. You don't want to state split on feof() and ferror(), but rather on the stream operations! This is why you only want to tell the analyzer what the return value of these functions are going to be, because the state of the stream should be set by the time we reach these functions, right?

How about untracked streams? What if we call feof() on a stream we got from a parameter, wouldn't that suggest that the stream is probably non-null and could either be EOF or non-EOF?

clang/lib/StaticAnalyzer/Checkers/StreamChecker.cpp
92	It doesn't matter a whole lot, we could leave this as-is :)

In D75682#1926732, @Szelethus wrote:

Riiiight I think I finally get it. You don't want to state split on feof() and ferror(), but rather on the stream operations!

Yes the split is at the operations. I did not think even of splitting at start of feof to determine "backwards" the result of the previous operation. This could be another approach. But the way of split depends on the previous operation (to check if error is possible based on possible constraints on its return value), probably not better (but less state split?).

In D75682#1926732, @Szelethus wrote:

How about untracked streams? What if we call feof() on a stream we got from a parameter, wouldn't that suggest that the stream is probably non-null and could either be EOF or non-EOF?

Currently the operation is completely ignored on untracked stream. Could start with a non-null stream and every error case is possible.

Herald added a subscriber: DenisDvlp. · View Herald TranscriptMar 17 2020, 10:00 AM

In D75682#1926889, @balazske wrote:

In D75682#1926732, @Szelethus wrote:

Riiiight I think I finally get it. You don't want to state split on feof() and ferror(), but rather on the stream operations!

Yes the split is at the operations. I did not think even of splitting at start of feof() to determine "backwards" the result of the previous operation. This could be another approach. But the way of split depends on the previous operation (to check if error is possible based on possible constraints on its return value), probably not better (but less state split?).

This is probably why it took a bit for us to understand each other! :) Generally speaking, we try to avoid state splitting as much as possible, but you totally convinced me, this isn't the place where we should be conservative. One should always expect stream operations to potentially fail, and that really does introduce two separate paths of execution.

The main issue that remains is testability, and now that we are on the same page, I see why you were concerned about it. I have a couple ideas:

For streams where the precise state is unknown (they are not tracked), start tracking. If we explicitly check whether a state is foef(), we can rightfully assume both of those possibilities.
Add debug function similar to clang_analyzer_express, like clang_analyzer_set_eof(FILE *), etc.
Evalulate a less complicated stream modeling function that sets such flags, though I suspect the followup patch is supposed to be the one doing this.

What do you think?

Herald added a subscriber: ASDenysPetrov. · View Herald TranscriptMar 18 2020, 5:18 AM

In D75682#1928716, @Szelethus wrote:

For streams where the precise state is unknown (they are not tracked), start tracking. If we explicitly check whether a state is foef(), we can rightfully assume both of those possibilities.

This can follow in a later revision only. The new stream should have every error state possible, so when it is introduced 3 new states are required (with the error types), or an "unknown error" is needed that will be added later.

Add debug function similar to clang_analyzer_express, like clang_analyzer_set_eof(FILE *), etc.

How to add this? Probably it is too much work and I do not like if separate such debug functions are there for "every" checker (or many of).

Evalulate a less complicated stream modeling function that sets such flags, though I suspect the followup patch is supposed to be the one doing this.

The fseek patch can do this.

In D75682#1929093, @balazske wrote:

In D75682#1928716, @Szelethus wrote:

For streams where the precise state is unknown (they are not tracked), start tracking. If we explicitly check whether a state is foef(), we can rightfully assume both of those possibilities.

This can follow in a later revision only. The new stream should have every error state possible, so when it is introduced 3 new states are required (with the error types), or an "unknown error" is needed that will be added later.

Evalulate a less complicated stream modeling function that sets such flags, though I suspect the followup patch is supposed to be the one doing this.

The fseek patch can do this.

Very well.

Add debug function similar to clang_analyzer_express, like clang_analyzer_set_eof(FILE *), etc.

How to add this? Probably it is too much work and I do not like if separate such debug functions are there for "every" checker (or many of).

@steakhal recently made a patch that contained that and not much else: D74131. I share you dislike with the debug function, but I suspect that it wouldn't be too much work, especially compared to my other suggestions that would introduce testability, unless you have some other ideas (to which I'd be open to!).

DenisDvlp removed a subscriber: DenisDvlp.Mar 18 2020, 2:28 PM

baloghadamsoftware added inline comments.Mar 19 2020, 1:10 AM

clang/lib/StaticAnalyzer/Checkers/StreamChecker.cpp
29	Create -> Creation

Adding special test functions is not as easy, then StreamState should be accessible from another checker. It could be added to the same file, or new file but then moving the data structures into header is needed. At least for short-term it is more simple to add a stream function that generates error state (fseek is applicable or other). This is why the fseek patch is better to be included in this change (but makes it more complicated). Otherwise this revision must be added without tests for state observer functions.

In D75682#1931108, @balazske wrote:

Adding special test functions is not as easy, then StreamState should be accessible from another checker. It could be added to the same file, or new file but then moving the data structures into header is needed.

For the time being, I don't fancy the idea of moving code to a header file.

At least for short-term it is more simple to add a stream function that generates error state (fseek is applicable or other). This is why the fseek patch is better to be included in this change (but makes it more complicated). Otherwise this revision must be added without tests for state observer functions.

It seems like D75356 should have been the revision to finish, after all :^). I agree -- don't worry about bloating the patch a bit, I've spend a lot of time with your patches, I feel confident in my ability at this point to be able to thoroughly review it.

So what is missing or wrong in this change to be accepted?

In D75682#1940901, @balazske wrote:

So what is missing or wrong in this change to be accepted?

This change is functional but is not testable. But the high level idea is great! I think we should maybe merge this with the followup, as you originally intended to.

Also, judging from the fact that how differently we imagined this to go (even though imo your idea is superior to mine), we should explain what our state splitting ideology is in the code.

Added test checker.

Harbormaster failed remote builds in B50954: Diff 253587!Mar 30 2020, 8:03 AM

Ping.
It is now testable, a sub-checker was added for testing (that can be useful at later improvements too).

LGTM, let's go! Thank you so much for sticking this out!

clang/lib/StaticAnalyzer/Checkers/StreamChecker.cpp
43–52	Can a stream be both `feof` and `ferror`? Let's not delay this patch one moment longer, but a `TODO` to give this some thought might be nice.
clang/test/Analysis/stream-error.c
1 ↗	(On Diff #253587)	`core` isn't enabled! :o

This revision is now accepted and ready to land.Apr 6 2020, 3:39 AM

balazske marked an inline comment as done.Apr 6 2020, 5:45 AM

balazske added inline comments.

clang/test/Analysis/stream-error.c
1 ↗	(On Diff #253587)	Is it needed? It was probably not set (and is not) in the original test in `stream.c`.

Adding comment about ferror, feof.
Enabled core checker in test file.

Harbormaster failed remote builds in B51959: Diff 255360!Apr 6 2020, 9:45 AM

NoQ added inline comments.Apr 6 2020, 11:29 AM

clang/include/clang/StaticAnalyzer/Checkers/Checkers.td
455 ↗	(On Diff #255360)	This should go into the debug package. We don't want users to enable it accidentally by enabling the whole `unix` package.
clang/lib/StaticAnalyzer/Checkers/StreamChecker.cpp
26–28	`assertStreamStateOpened(SS);` isn't much harder to type. Can we make it a function please?

Szelethus added inline comments.Apr 6 2020, 12:09 PM

clang/include/clang/StaticAnalyzer/Checkers/Checkers.td
455 ↗	(On Diff #255360)	Oh, nice catch!
clang/lib/StaticAnalyzer/Checkers/StreamChecker.cpp
26–28	Mind that this is completely stripped from release builds, unlike a function call. I think.

NoQ added inline comments.Apr 6 2020, 12:47 PM

clang/lib/StaticAnalyzer/Checkers/StreamChecker.cpp
26–28	I'd be shocked if that's the case. In LLVM we have a lot of tiny functions all over the place and we'd better have them inlined most of the time.

Szelethus added inline comments.Apr 6 2020, 1:14 PM

clang/lib/StaticAnalyzer/Checkers/StreamChecker.cpp
26–28	Ah, okay. Lets turn this into a regular function then!

Moved test checker to debug package, changed macro to function.

Harbormaster failed remote builds in B52120: Diff 255602!Apr 7 2020, 12:30 AM

Whoo!

Adding diff to master.

Harbormaster failed remote builds in B52299: Diff 255906!Apr 8 2020, 12:30 AM

Closed by commit rG11bd3e5c6549: [Analyzer][StreamChecker] Introduction of stream error handling. (authored by balazske). · Explain WhyApr 8 2020, 2:40 AM

This revision was automatically updated to reflect the committed changes.

Revision Contents

Path

Size

clang/

lib/

StaticAnalyzer/

Checkers/

StreamChecker.cpp

145 lines

Diff 250588

clang/lib/StaticAnalyzer/Checkers/StreamChecker.cpp

Show All 13 Lines
#include "clang/StaticAnalyzer/Core/BugReporter/BugType.h"		#include "clang/StaticAnalyzer/Core/BugReporter/BugType.h"
#include "clang/StaticAnalyzer/Core/Checker.h"		#include "clang/StaticAnalyzer/Core/Checker.h"
#include "clang/StaticAnalyzer/Core/CheckerManager.h"		#include "clang/StaticAnalyzer/Core/CheckerManager.h"
#include "clang/StaticAnalyzer/Core/PathSensitive/CallEvent.h"		#include "clang/StaticAnalyzer/Core/PathSensitive/CallEvent.h"
#include "clang/StaticAnalyzer/Core/PathSensitive/CheckerContext.h"		#include "clang/StaticAnalyzer/Core/PathSensitive/CheckerContext.h"
#include "clang/StaticAnalyzer/Core/PathSensitive/ProgramState.h"		#include "clang/StaticAnalyzer/Core/PathSensitive/ProgramState.h"
#include "clang/StaticAnalyzer/Core/PathSensitive/ProgramStateTrait.h"		#include "clang/StaticAnalyzer/Core/PathSensitive/ProgramStateTrait.h"
#include "clang/StaticAnalyzer/Core/PathSensitive/SymbolManager.h"		#include "clang/StaticAnalyzer/Core/PathSensitive/SymbolManager.h"
#include <functional>

using namespace clang;		using namespace clang;
using namespace ento;		using namespace ento;
using namespace std::placeholders;		using namespace std::placeholders;

		#define ASSERT_STREAMSTATE_OPENED \
		assert(SS->isOpened() && \
		NoQUnsubmitted Not Done Reply Inline Actions `assertStreamStateOpened(SS);` isn't much harder to type. Can we make it a function please? NoQ: `assertStreamStateOpened(SS);` isn't much harder to type. Can we make it a function please?
		SzelethusUnsubmitted Not Done Reply Inline Actions Mind that this is completely stripped from release builds, unlike a function call. I think. Szelethus: Mind that this is completely stripped from release builds, unlike a function call. I think.
		NoQUnsubmitted Not Done Reply Inline Actions I'd be shocked if that's the case. In LLVM we have a lot of tiny functions all over the place and we'd better have them inlined most of the time. NoQ: I'd be shocked if that's the case. In LLVM we have a lot of tiny functions all over the place…
		SzelethusUnsubmitted Not Done Reply Inline Actions Ah, okay. Lets turn this into a regular function then! Szelethus: Ah, okay. Lets turn this into a regular function then!
		"Create of error node failed (only way to get this state)?");
		baloghadamsoftwareUnsubmitted Not Done Reply Inline Actions Create -> Creation baloghadamsoftware: Create -> Creation

namespace {		namespace {

/// Full state information about a stream pointer.		/// Full state information about a stream pointer.
struct StreamState {		struct StreamState {
/// State of a stream symbol.		/// State of a stream symbol.
		/// FIXME: We need maybe an "escaped" state later.
enum KindTy {		enum KindTy {
Opened, /// Stream is opened.		Opened, /// Stream is opened.
Closed, /// Closed stream (an invalid stream pointer after it was closed).		Closed, /// Closed stream (an invalid stream pointer after it was closed).
OpenFailed /// The last open operation has failed.		OpenFailed /// The last open operation has failed.
} State;		} State;

/// The error state of a stream.		/// The error state of a stream.
		/// Valid only if the stream is opened.
enum ErrorKindTy {		enum ErrorKindTy {
NoError, /// No error flag is set or stream is not open.		/// No error flag is set (or stream is not open).
EofError, /// EOF condition (`feof` is true).		NoError,
OtherError, /// Other (non-EOF) error (`ferror` is true).		/// EOF condition (`feof` is true).
AnyError /// EofError or OtherError, used if exact error is unknown.		FEof,
		/// Other generic (non-EOF) error (`ferror` is true).
		FError,
} ErrorState = NoError;		} ErrorState = NoError;
		SzelethusUnsubmitted Not Done Reply Inline Actions Can a stream be both `feof` and `ferror`? Let's not delay this patch one moment longer, but a `TODO` to give this some thought might be nice. Szelethus: Can a stream be both `feof` and `ferror`? Let's not delay this patch one moment longer, but a…

bool isOpened() const { return State == Opened; }		bool isOpened() const { return State == Opened; }
bool isClosed() const { return State == Closed; }		bool isClosed() const { return State == Closed; }
bool isOpenFailed() const { return State == OpenFailed; }		bool isOpenFailed() const { return State == OpenFailed; }

bool isNoError() const { return ErrorState == NoError; }		bool isNoError() const {
bool isEofError() const { return ErrorState == EofError; }		assert(State == Opened && "Error undefined for closed stream.");
bool isOtherError() const { return ErrorState == OtherError; }		return ErrorState == NoError;
bool isAnyError() const { return ErrorState == AnyError; }		}
		bool isFEof() const {
		assert(State == Opened && "Error undefined for closed stream.");
		return ErrorState == FEof;
		}
		bool isFError() const {
		assert(State == Opened && "Error undefined for closed stream.");
		return ErrorState == FError;
		}

bool operator==(const StreamState &X) const {		bool operator==(const StreamState &X) const {
		// In not opened state error should always NoError.
return State == X.State && ErrorState == X.ErrorState;		return State == X.State && ErrorState == X.ErrorState;
}		}

static StreamState getOpened() { return StreamState{Opened}; }		static StreamState getOpened() { return StreamState{Opened}; }
static StreamState getClosed() { return StreamState{Closed}; }		static StreamState getClosed() { return StreamState{Closed}; }
static StreamState getOpenFailed() { return StreamState{OpenFailed}; }		static StreamState getOpenFailed() { return StreamState{OpenFailed}; }
static StreamState getOpenedWithAnyError() {		static StreamState getOpenedWithFEof() { return StreamState{Opened, FEof}; }
return StreamState{Opened, AnyError};		static StreamState getOpenedWithFError() {
}		return StreamState{Opened, FError};
static StreamState getOpenedWithEofError() {
return StreamState{Opened, EofError};
}
static StreamState getOpenedWithOtherError() {
return StreamState{Opened, OtherError};
}		}
		SzelethusUnsubmitted Not Done Reply Inline Actions Shouldn't we call this `FError` instead, if this is set precisely when `ferror()` is true? Despite the comments, I still managed to get myself confused with it :) Szelethus: Shouldn't we call this `FError` instead, if this is set precisely when `ferror()` is true?
		balazskeAuthorUnsubmitted Done Reply Inline Actions Plan is to rename to `NoError`, `FEofError`, `FErrorError`, `FEofOrFErrorError`. balazske: Plan is to rename to `NoError`, `FEofError`, `FErrorError`, `FEofOrFErrorError`.
		baloghadamsoftwareUnsubmitted Not Done Reply Inline Actions If would suggest `NoError`, `FEoF`, `FError`, `FEoFOrFError`. Too many `Error` suffixes reduce the readability. baloghadamsoftware: If would suggest `NoError`, `FEoF`, `FError`, `FEoFOrFError`. Too many `Error` suffixes reduce…
		baloghadamsoftwareUnsubmitted Not Done Reply Inline Actions I suggest to omit the last error state in this patch and introduce it in the patch which uses it. baloghadamsoftware: I suggest to omit the last error state in this patch and introduce it in the patch which uses…

		SzelethusUnsubmitted Not Done Reply Inline Actions When do we know that the stream is in an error state, but not precisely know what that error is within the context of this patch? `fseek` would indeed introduce such a state, as described in D75356#inline-689287, but that should introduce its own error `ErrorKind`. I still don't really get why we would ever need `AnyError`. Szelethus: When do we know that the stream is in an error state, but not precisely know what that error is…
		balazskeAuthorUnsubmitted Done Reply Inline Actions This is the problem is the change is too small: The next part of it introduces functions where the new error flags are used. In this patch the AnyError feature is not used (the feof and ferror read it but nothing sets it). balazske: This is the problem is the change is too small: The next part of it introduces functions where…
		SzelethusUnsubmitted Not Done Reply Inline Actions Okay, after your comment in the other revision, I see how such a kind should be left, and the bug report should rather be enriched by `NoteTags`. With that said, `AnyError` is still a bit confusing as a name -- how abour `UnknownError`? Szelethus: Okay, after your comment in the other revision, I see how such a kind should be left, and the…
		balazskeAuthorUnsubmitted Done Reply Inline Actions I want it to rename to `FEofOrFErrorError`, this tells exactly what it is (`UnknownError` can be still thought as a third error type). For these 2 error types this naming is not too long, and new error types are not likely to appear here later. balazske: I want it to rename to `FEofOrFErrorError`, this tells exactly what it is (`UnknownError` can…
		SzelethusUnsubmitted Not Done Reply Inline Actions Okay, sure. Szelethus: Okay, sure.
void Profile(llvm::FoldingSetNodeID &ID) const {		void Profile(llvm::FoldingSetNodeID &ID) const {
		SzelethusUnsubmitted Not Done Reply Inline Actions Hmm, now that I think of it, could we just merge these 2 enums? Also, I fear that indexers would accidentally assign the comment to the enum after the comma: Opened, /// Stream is opened. Closed, /// Closed stream (an invalid stream pointer after it was closed). OpenFailed /// The last open operation has failed. `/// Stream is opened` might be assigned to `Closed`. How about this: /// Stream is opened. Opened, /// Closed stream (an invalid stream pointer after it was closed). Closed, /// The last open operation has failed. OpenFailed Szelethus: Hmm, now that I think of it, could we just merge these 2 enums? Also, I fear that indexers…
		balazskeAuthorUnsubmitted Done Reply Inline Actions Probably these can be merged, it is used for a stream that is in an indeterminate state after failed `freopen`, but practically it is handled the same way as a closed stream. But this change would be done in another revision. balazske: Probably these can be merged, it is used for a stream that is in an indeterminate state after…
		SzelethusUnsubmitted Not Done Reply Inline Actions I meant to merge the two enums (`StreamState` and `ErrorKindTy`) and the fields associated with them (`State` and `ErrorState`). We could however merge `Closed` and `OpenFailed`, granted that we put a `NoteTag` to `evalFreopen`. But I agree, that should be another revision's topic :) Szelethus: I meant to merge the two enums (`StreamState` and `ErrorKindTy`) and the fields associated with…
		xazax.hunUnsubmitted Not Done Reply Inline Actions Since you mentioned that ErrorState is only applicable to Open streams, I am also +1 on merging the enums. These two states are not orthogonal, no reason for them to be separate. xazax.hun: Since you mentioned that ErrorState is only applicable to Open streams, I am also +1 on merging…
		baloghadamsoftwareUnsubmitted Not Done Reply Inline Actions Not orthogonal, but rather hiearchical. That is a reason for not merging. I am completely against it. baloghadamsoftware: Not orthogonal, but rather hiearchical. That is a reason for not merging. I am completely…
		xazax.hunUnsubmitted Not Done Reply Inline Actions In a more expressive language each enum value could have parameters and we could have Opened(ErrorKind), Closed, OpenFailed While we do not have such an expressive language, we can simulate this using the current constructs such as a variant. The question is, does this worth the effort? At this point this is more like the matter of taste as long as it is properly documented. xazax.hun: In a more expressive language each enum value could have parameters and we could have ```…
		balazskeAuthorUnsubmitted Done Reply Inline Actions Have a `bool isOpened` and an error kind is better? balazske: Have a `bool isOpened` and an error kind is better?
		SzelethusUnsubmitted Not Done Reply Inline Actions That sounds wonderful, @balazske! :) Szelethus: That sounds wonderful, @balazske! :)
		baloghadamsoftwareUnsubmitted Not Done Reply Inline Actions Yes, we do not need the `OpenFailed` state either. A stream is either opened or not. This is the state of the stream. If there is any error, there are the error codes for. baloghadamsoftware: Yes, we do not need the `OpenFailed` state either. A stream is either opened or not. This is…
		NoQUnsubmitted Not Done Reply Inline Actions Yes, we do not need the `OpenFailed` state either. A stream is either opened or not. This is the state of the stream. If there is any error, there are the error codes for. Nope, there's also the state in which the stream was before it was opened, and it's different from being opened and closed. Most of the time we simply don't track the stream in such state, but when an open fails, we suddenly start tracking it, so we need a separate enum. Like, we'll have to emit different diagnostic messages depending on whether the stream is in `Closed` state or in an `OpenFailed` state and this information should be there at the bug report site, so it's part of the program state. NoQ: > Yes, we do not need the `OpenFailed` state either. A stream is either opened or not. This is…
		balazskeAuthorUnsubmitted Done Reply Inline Actions The "error kind" concept is needed separately from the stream state, see D75851. This is a reason for not merging them. balazske: The "error kind" concept is needed separately from the stream state, see D75851. This is a…
ID.AddInteger(State);		ID.AddInteger(State);
ID.AddInteger(ErrorState);		ID.AddInteger(ErrorState);
}		}
};		};

class StreamChecker;		class StreamChecker;
		SzelethusUnsubmitted Not Done Reply Inline Actions If a stream's opening failed, its still an error, yet this getter would return true for it. I think this is yet another reason to just merge the 2 enums :) Szelethus: If a stream's opening failed, its still an error, yet this getter would return true for it. I…
		balazskeAuthorUnsubmitted Done Reply Inline Actions But as the comment says, the error state is applicable only in the opened state. If not opened, we may not call the "error" functions at all (assertion can be added). Anyway it is possible to merge the enums, then the `isOpened` will be a bit difficult (true if state is one of `OpenedNoError`, `OpenedFErrorError`, `OpenedFEofError`, `OpenedFEofOrFErrorError`). Logically it is better to see that the state is an aggregation of two states, an "openedness" and an error state. Probably this does not matter much here because it is unlikely that new states will appear that make the code more complex. balazske: But as the comment says, the error state is applicable only in the opened state. If not opened…
		baloghadamsoftwareUnsubmitted Not Done Reply Inline Actions I would not merge the two enums: this way they are hierarchical. When only checking the openness we do not need to check for 4 different states. I completely agree with @balazske here. baloghadamsoftware: I would not merge the two enums: this way they are hierarchical. When only checking the…
struct FnDescription;		struct FnDescription;
using FnCheck = std::function<void(const StreamChecker , const FnDescription ,		using FnCheck = std::function<void(const StreamChecker , const FnDescription ,
		SzelethusUnsubmitted Not Done Reply Inline Actions Same, shouldn't this be `isFError()`? Szelethus: Same, shouldn't this be `isFError()`?
		NoQUnsubmitted Done Reply Inline Actions `llvm::function_ref`? NoQ: `llvm::function_ref`?
		balazskeAuthorUnsubmitted Done Reply Inline Actions `function_ref`'s documentation says: This class does not own the callable, so it is not in general safe to store a function_ref. The `FnDescription` stores these functions so it is not safe to use `llvm::function_ref`? balazske: `function_ref`'s documentation says: > This class does not own the callable, so it is not in…
		NoQUnsubmitted Not Done Reply Inline Actions I think you're using it only for global function pointers, no? NoQ: I think you're using it only for global function pointers, no?
		balazskeAuthorUnsubmitted Done Reply Inline Actions Probably can work but I tried it and got compile errors. balazske: Probably can work but I tried it and got compile errors.
		SzelethusUnsubmitted Not Done Reply Inline Actions It doesn't matter a whole lot, we could leave this as-is :) Szelethus: It doesn't matter a whole lot, we could leave this as-is :)
const CallEvent &, CheckerContext &)>;		const CallEvent &, CheckerContext &)>;
		SzelethusUnsubmitted Not Done Reply Inline Actions This is confusing -- I would expect a method called `isAnyError()` to return true when `isNoError()` is false. Szelethus: This is confusing -- I would expect a method called `isAnyError()` to return true when…
		balazskeAuthorUnsubmitted Done Reply Inline Actions The error state kinds are not mutually exclusive. The "any error" means we know that there is some error but we do not know what the error is (it is not relevant because nothing was done that depends on it). If a function is called that needs to know if "ferror" or "feof" is the error, the state will be split to `FErrorError` and `FEofError`. balazske: The error state kinds are not mutually exclusive. The "any error" means we know that there is…
		SzelethusUnsubmitted Not Done Reply Inline Actions How about we change the name of it to `isUnknownError`? My main issue is that to me, the name `isAnyError()` answeres the question whether the stream is in any form of erroneous state. Szelethus: How about we change the name of it to `isUnknownError`? My main issue is that to me, the name…
		baloghadamsoftwareUnsubmitted Not Done Reply Inline Actions Yes, "Any" usually means any of the possible kinds. But instead of `UnknownError` use `FEoFOrFerror`. baloghadamsoftware: Yes, "Any" usually means //any// of the possible kinds. But instead of `UnknownError` use…

using ArgNoTy = unsigned int;		using ArgNoTy = unsigned int;
static const ArgNoTy ArgNone = std::numeric_limits<ArgNoTy>::max();		static const ArgNoTy ArgNone = std::numeric_limits<ArgNoTy>::max();

struct FnDescription {		struct FnDescription {
FnCheck PreFn;		FnCheck PreFn;
FnCheck EvalFn;		FnCheck EvalFn;
ArgNoTy StreamArgNo;		ArgNoTy StreamArgNo;
};		};

/// Get the value of the stream argument out of the passed call event.		/// Get the value of the stream argument out of the passed call event.
/// The call should contain a function that is described by Desc.		/// The call should contain a function that is described by Desc.
SVal getStreamArg(const FnDescription *Desc, const CallEvent &Call) {		SVal getStreamArg(const FnDescription *Desc, const CallEvent &Call) {
assert(Desc && Desc->StreamArgNo != ArgNone &&		assert(Desc && Desc->StreamArgNo != ArgNone &&
"Try to get a non-existing stream argument.");		"Try to get a non-existing stream argument.");
return Call.getArgSVal(Desc->StreamArgNo);		return Call.getArgSVal(Desc->StreamArgNo);
}		}

/// Create a conjured symbol return value for a call expression.		/// Create a conjured symbol return value for a call expression.
DefinedSVal makeRetVal(CheckerContext &C, const CallExpr *CE) {		DefinedSVal makeRetVal(CheckerContext &C, const CallExpr *CE) {
assert(CE && "Expecting a call expression.");		assert(CE && "Expecting a call expression.");

const LocationContext *LCtx = C.getPredecessor()->getLocationContext();		const LocationContext *LCtx = C.getLocationContext();
		NoQUnsubmitted Done Reply Inline Actions Simply `C.getLocationContext()`. NoQ: Simply `C.getLocationContext()`.
return C.getSValBuilder()		return C.getSValBuilder()
.conjureSymbolVal(nullptr, CE, LCtx, C.blockCount())		.conjureSymbolVal(nullptr, CE, LCtx, C.blockCount())
.castAs<DefinedSVal>();		.castAs<DefinedSVal>();
}		}

class StreamChecker		class StreamChecker
: public Checker<check::PreCall, eval::Call, check::DeadSymbols> {		: public Checker<check::PreCall, eval::Call, check::DeadSymbols> {
mutable std::unique_ptr<BuiltinBug> BT_nullfp, BT_illegalwhence,		mutable std::unique_ptr<BuiltinBug> BT_nullfp, BT_illegalwhence,
Show All 16 Lines	CallDescriptionMap<FnDescription> FnDescriptions = {
{{"fwrite", 4}, {&StreamChecker::preDefault, nullptr, 3}},		{{"fwrite", 4}, {&StreamChecker::preDefault, nullptr, 3}},
{{"fseek", 3}, {&StreamChecker::preFseek, nullptr, 0}},		{{"fseek", 3}, {&StreamChecker::preFseek, nullptr, 0}},
{{"ftell", 1}, {&StreamChecker::preDefault, nullptr, 0}},		{{"ftell", 1}, {&StreamChecker::preDefault, nullptr, 0}},
{{"rewind", 1}, {&StreamChecker::preDefault, nullptr, 0}},		{{"rewind", 1}, {&StreamChecker::preDefault, nullptr, 0}},
{{"fgetpos", 2}, {&StreamChecker::preDefault, nullptr, 0}},		{{"fgetpos", 2}, {&StreamChecker::preDefault, nullptr, 0}},
{{"fsetpos", 2}, {&StreamChecker::preDefault, nullptr, 0}},		{{"fsetpos", 2}, {&StreamChecker::preDefault, nullptr, 0}},
{{"clearerr", 1},		{{"clearerr", 1},
{&StreamChecker::preDefault, &StreamChecker::evalClearerr, 0}},		{&StreamChecker::preDefault, &StreamChecker::evalClearerr, 0}},
{{"feof", 1}, {&StreamChecker::preDefault, &StreamChecker::evalFeof, 0}},		{{"feof", 1},
		{&StreamChecker::preDefault,
		&StreamChecker::evalFeofFerror<&StreamState::isFEof>, 0}},
{{"ferror", 1},		{{"ferror", 1},
{&StreamChecker::preDefault, &StreamChecker::evalFerror, 0}},		{&StreamChecker::preDefault,
		&StreamChecker::evalFeofFerror<&StreamState::isFError>, 0}},
{{"fileno", 1}, {&StreamChecker::preDefault, nullptr, 0}},		{{"fileno", 1}, {&StreamChecker::preDefault, nullptr, 0}},
};		};

void evalFopen(const FnDescription *Desc, const CallEvent &Call,		void evalFopen(const FnDescription *Desc, const CallEvent &Call,
CheckerContext &C) const;		CheckerContext &C) const;

void preFreopen(const FnDescription *Desc, const CallEvent &Call,		void preFreopen(const FnDescription *Desc, const CallEvent &Call,
CheckerContext &C) const;		CheckerContext &C) const;
void evalFreopen(const FnDescription *Desc, const CallEvent &Call,		void evalFreopen(const FnDescription *Desc, const CallEvent &Call,
CheckerContext &C) const;		CheckerContext &C) const;

void evalFclose(const FnDescription *Desc, const CallEvent &Call,		void evalFclose(const FnDescription *Desc, const CallEvent &Call,
CheckerContext &C) const;		CheckerContext &C) const;

void preFseek(const FnDescription *Desc, const CallEvent &Call,		void preFseek(const FnDescription *Desc, const CallEvent &Call,
CheckerContext &C) const;		CheckerContext &C) const;

void preDefault(const FnDescription *Desc, const CallEvent &Call,		void preDefault(const FnDescription *Desc, const CallEvent &Call,
CheckerContext &C) const;		CheckerContext &C) const;

void evalClearerr(const FnDescription *Desc, const CallEvent &Call,		void evalClearerr(const FnDescription *Desc, const CallEvent &Call,
CheckerContext &C) const;		CheckerContext &C) const;

void evalFeof(const FnDescription *Desc, const CallEvent &Call,		template <bool (StreamState::*IsOfError)() const>
CheckerContext &C) const;		void evalFeofFerror(const FnDescription *Desc, const CallEvent &Call,

void evalFerror(const FnDescription *Desc, const CallEvent &Call,
CheckerContext &C) const;		CheckerContext &C) const;

/// Check that the stream (in StreamVal) is not NULL.		/// Check that the stream (in StreamVal) is not NULL.
/// If it can only be NULL a fatal error is emitted and nullptr returned.		/// If it can only be NULL a fatal error is emitted and nullptr returned.
/// Otherwise the return value is a new state where the stream is constrained		/// Otherwise the return value is a new state where the stream is constrained
/// to be non-null.		/// to be non-null.
ProgramStateRef ensureStreamNonNull(SVal StreamVal, CheckerContext &C,		ProgramStateRef ensureStreamNonNull(SVal StreamVal, CheckerContext &C,
ProgramStateRef State) const;		ProgramStateRef State) const;

▲ Show 20 Lines • Show All 133 Lines • ▼ Show 20 Lines	void StreamChecker::evalFclose(const FnDescription *Desc, const CallEvent &Call,
SymbolRef Sym = getStreamArg(Desc, Call).getAsSymbol();		SymbolRef Sym = getStreamArg(Desc, Call).getAsSymbol();
if (!Sym)		if (!Sym)
return;		return;

const StreamState *SS = State->get<StreamMap>(Sym);		const StreamState *SS = State->get<StreamMap>(Sym);
if (!SS)		if (!SS)
return;		return;

		ASSERT_STREAMSTATE_OPENED;

// Close the File Descriptor.		// Close the File Descriptor.
// Regardless if the close fails or not, stream becomes "closed"		// Regardless if the close fails or not, stream becomes "closed"
// and can not be used any more.		// and can not be used any more.
State = State->set<StreamMap>(Sym, StreamState::getClosed());		State = State->set<StreamMap>(Sym, StreamState::getClosed());

C.addTransition(State);		C.addTransition(State);
}		}

Show All 21 Lines	void StreamChecker::evalClearerr(const FnDescription *Desc,
SymbolRef StreamSym = getStreamArg(Desc, Call).getAsSymbol();		SymbolRef StreamSym = getStreamArg(Desc, Call).getAsSymbol();
if (!StreamSym)		if (!StreamSym)
return;		return;

const StreamState *SS = State->get<StreamMap>(StreamSym);		const StreamState *SS = State->get<StreamMap>(StreamSym);
if (!SS)		if (!SS)
return;		return;

		ASSERT_STREAMSTATE_OPENED;
		SzelethusUnsubmitted Done Reply Inline Actions What if we call `clearerr()` on a stream that is in an `feof()` state? Shouln't we return if the stream is `!isOtherError()` (or `!isFError()`, if we were to rename it)? Szelethus: What if we call `clearerr()` on a stream that is in an `feof()` state? Shouln't we return if…
		balazskeAuthorUnsubmitted Done Reply Inline Actions `clearerr` does not return any value. It only clears the error flags (sets to false). In my interpretation the stream has one error value associated with it, this value may be EOF or "other" error, or no error. There are not two error flags, only one kind of error that may be EOF or other. balazske: `clearerr` does not return any value. It only clears the error flags (sets to false). In my…
		SzelethusUnsubmitted Done Reply Inline Actions The `clearerr()` function clears the end-of file and error indicators for the given stream. Yup, you're totally right on this one. My bad. Szelethus: > The `clearerr()` function clears the end-of file and error indicators for the given stream.

if (SS->isNoError())		if (SS->isNoError())
return;		return;
		NoQUnsubmitted Done Reply Inline Actions So what exactly happens when you're calling `clearerr` on a closed stream? Does it actually become opened? Also what happens when you're calling it on an open-failed stream? NoQ: So what exactly happens when you're calling `clearerr` on a closed stream? Does it actually…
		balazskeAuthorUnsubmitted Done Reply Inline Actions The stream gets always in opened and error-free state. The `preDefault` checks that the stream is opened (not closed and not open-failed) and creates error node if not. If this create fails that is a problem case, the stream will be opened after `clearerr`. Should check in the eval functions for closed stream even if the pre-functions (in `preCall` callback) have normally checked this? balazske: The stream gets always in opened and error-free state. The `preDefault` checks that the stream…
		NoQUnsubmitted Not Done Reply Inline Actions Aha, ok, got it. Let's assert it then? NoQ: Aha, ok, got it. Let's assert it then?

State = State->set<StreamMap>(StreamSym, StreamState::getOpened());		State = State->set<StreamMap>(StreamSym, StreamState::getOpened());
C.addTransition(State);		C.addTransition(State);
}		}

void StreamChecker::evalFeof(const FnDescription *Desc, const CallEvent &Call,		template <bool (StreamState::*IsOfError)() const>
		void StreamChecker::evalFeofFerror(const FnDescription *Desc,
		const CallEvent &Call,
CheckerContext &C) const {		CheckerContext &C) const {
ProgramStateRef State = C.getState();		ProgramStateRef State = C.getState();
SymbolRef StreamSym = getStreamArg(Desc, Call).getAsSymbol();		SymbolRef StreamSym = getStreamArg(Desc, Call).getAsSymbol();
if (!StreamSym)		if (!StreamSym)
return;		return;

const CallExpr *CE = dyn_cast_or_null<CallExpr>(Call.getOriginExpr());		const CallExpr *CE = dyn_cast_or_null<CallExpr>(Call.getOriginExpr());
		NoQUnsubmitted Not Done Reply Inline Actions Maybe we should add a helper function for this, `CallEvent::getOriginCallExpr()`. That's an otherwise ugly function given that `CallEvent` is a class hierarchy and the superclass doesn't need to know about subclasses, but most users do in fact only care about call-expressions and concise checker API is important. Same with `dyn_cast_or_null<FunctionDecl>(Call.getDecl())` that could be written as `Call.getFunctionDecl()`. NoQ: Maybe we should add a helper function for this, `CallEvent::getOriginCallExpr()`. That's an…
		balazskeAuthorUnsubmitted Done Reply Inline Actions Can be done in another revision together with changing every use of `dyn_cast_or_null<CallExpr>(Call.getOriginExpr())` to the new function. balazske: Can be done in another revision together with changing every use of `dyn_cast_or_null<CallExpr>…
if (!CE)		if (!CE)
return;		return;

const StreamState *SS = State->get<StreamMap>(StreamSym);		const StreamState *SS = State->get<StreamMap>(StreamSym);
// Ignore the call if the stream is not tracked.		// Ignore the call if the stream is not tracked.
if (!SS)		if (!SS)
return;		return;
		NoQUnsubmitted Done Reply Inline Actions I recommend not introducing a new symbol when you're going to return 0 anyway. A plain 0 is easier on the constraint solver than a symbol constrained to 0. NoQ: I recommend not introducing a new symbol when you're going to return 0 anyway. A plain 0 is…

DefinedSVal RetVal = makeRetVal(C, CE);		ASSERT_STREAMSTATE_OPENED;

// Make expression result.
State = State->BindExpr(CE, C.getLocationContext(), RetVal);

		SzelethusUnsubmitted Done Reply Inline Actions Does this ever run? We never actually set the stream state to `AnyError`. Szelethus: Does this ever run? We never actually set the stream state to `AnyError`.
		baloghadamsoftwareUnsubmitted Done Reply Inline Actions I suggest to move codes that are only executed after a new functionality is added in a subsequent patch to that patch. baloghadamsoftware: I suggest to move codes that are only executed after a new functionality is added in a…
		balazskeAuthorUnsubmitted Done Reply Inline Actions This is not applicable any more, `AnyError` is removed (from here). balazske: This is not applicable any more, `AnyError` is removed (from here).
if (SS->isAnyError()) {		if ((SS->*IsOfError)()) {
// We do not know yet what kind of error is set.		// Function returns nonzero.
// Differentiate between EOF and other error.
ProgramStateRef StateEof, StateOther;
std::tie(StateEof, StateOther) = State->assume(RetVal);
assert(StateEof && StateOther &&
"Return value should not be constrained already.");
StateEof = StateEof->set<StreamMap>(StreamSym,
StreamState::getOpenedWithEofError());
StateOther = StateOther->set<StreamMap>(
StreamSym, StreamState::getOpenedWithOtherError());
C.addTransition(StateEof);
C.addTransition(StateOther);
} else {
// We know what error is set, make the return value accordingly.
State = State->assume(RetVal, SS->isEofError());
assert(State && "Return value should not be constrained already.");
C.addTransition(State);
}
}

void StreamChecker::evalFerror(const FnDescription *Desc, const CallEvent &Call,
CheckerContext &C) const {
ProgramStateRef State = C.getState();
SymbolRef StreamSym = getStreamArg(Desc, Call).getAsSymbol();
if (!StreamSym)
return;

const CallExpr *CE = dyn_cast_or_null<CallExpr>(Call.getOriginExpr());
if (!CE)
return;

const StreamState *SS = State->get<StreamMap>(StreamSym);
// Ignore the call if the stream is not tracked.
if (!SS)
return;

// Make expression result.
DefinedSVal RetVal = makeRetVal(C, CE);		DefinedSVal RetVal = makeRetVal(C, CE);
		SzelethusUnsubmitted Not Done Reply Inline Actions Is this correct? We set the `feof()` return value whether we know the state of the stream to be EOF, but in this case we would incorrectly mark it as non-EOF: $ cat a.txt # contains a single newline $ cat test.cpp #include <cstdio> void tryRead(FILE F) { char C = fgetc(F); if (feof(F)) printf("The stream is EOF!\n"); else printf("The stream is good! Read '%c'\n", C); } int main() { FILE F = fopen("a.txt", "r"); tryRead(F); tryRead(F); // Incorrectly mark F to be non-EOF } $ clang test.cpp -fsanitize=address && ./a.out The stream is good! Read ' ' The stream is EOF! Wouldn't it be safer to assume it to be `true` if we know its EOF and do nothing otherwise? How about a state split? (I know this functions also handler FError, but you see what my point is.) Szelethus: Is this correct? We set the `feof()` return value whether we know the state of the stream to be…
		balazskeAuthorUnsubmitted Done Reply Inline Actions The problem in this code seems to be that `fgetc` is not handled (yet) by the checker. The plan is to handle every possible file operation. The handler of `fgetc` would split the state to EOF and non-EOF case. In the current state this is a real "bug", it could be handled somehow by checking for escaped stream (the `fgetc` here should cause escape of the file pointer if not recognized as file operation). But this means again a new patch before this one. Or add every file operation to this patch with some simple implementation or with "forgetting" the stream? balazske: The problem in this code seems to be that `fgetc` is not handled (yet) by the checker. The plan…
State = State->BindExpr(CE, C.getLocationContext(), RetVal);		State = State->BindExpr(CE, C.getLocationContext(), RetVal);
		State = State->assume(RetVal, true);
if (SS->isAnyError()) {		assert(State && "Assumption on new value should not fail.");
// We do not know yet what kind of error is set.
// Differentiate between EOF and other error.
ProgramStateRef StateEof, StateOther;
std::tie(StateOther, StateEof) = State->assume(RetVal);
assert(StateEof && StateOther &&
"Return value should not be constrained already.");
StateEof = StateEof->set<StreamMap>(StreamSym,
StreamState::getOpenedWithEofError());
StateOther = StateOther->set<StreamMap>(
StreamSym, StreamState::getOpenedWithOtherError());
C.addTransition(StateEof);
C.addTransition(StateOther);
} else {		} else {
// We know what error is set, make the return value accordingly.		// Return zero.
State = State->assume(RetVal, SS->isOtherError());		State = State->BindExpr(CE, C.getLocationContext(),
assert(State && "Return value should not be constrained already.");		C.getSValBuilder().makeIntVal(0, false));
C.addTransition(State);
}		}
		xazax.hunUnsubmitted Done Reply Inline Actions As you probably know we are splitting the execution paths here. This will potentially result in doubling the analysis tome for a function for each `feof` call. Since state splits can be really bad for performance we need good justification for each of them. So the questions are: Is it really important to differentiate between eof and other error or is it possible to just have an any error state? Do we find more bugs in real world code that justifies these state splits? xazax.hun: As you probably know we are splitting the execution paths here. This will potentially result in…
		baloghadamsoftwareUnsubmitted Done Reply Inline Actions I agree here. Do not introduce such forks without specific reason. baloghadamsoftware: I agree here. Do not introduce such forks without specific reason.
		balazskeAuthorUnsubmitted Done Reply Inline Actions `feof` and `ferror` should return the same value if called multiple times (and no other file operations in between). If the exact error is not saved in the state, how can we ensure that the calls return the same value? balazske: `feof` and `ferror` should return the same value if called multiple times (and no other file…
		SzelethusUnsubmitted Done Reply Inline Actions This is a very interesting question indeed, never crossed my mind before. I think it would be better to move `AnyError` to the next revision and discuss it there. Szelethus: This is a very interesting question indeed, never crossed my mind before. I think it would be…
		balazskeAuthorUnsubmitted Done Reply Inline Actions The state fork problem is solved with `UnknownError` is the later revision, here it is no problem any more. A single state split is always needed because the return values differ in the failed and success case, and I want to make correlation between return values and stream error state. balazske: The state fork problem is solved with `UnknownError` is the later revision, here it is no…
		C.addTransition(State);
}		}

void StreamChecker::preDefault(const FnDescription *Desc, const CallEvent &Call,		void StreamChecker::preDefault(const FnDescription *Desc, const CallEvent &Call,
CheckerContext &C) const {		CheckerContext &C) const {
ProgramStateRef State = C.getState();		ProgramStateRef State = C.getState();
SVal StreamVal = getStreamArg(Desc, Call);		SVal StreamVal = getStreamArg(Desc, Call);
State = ensureStreamNonNull(StreamVal, C, State);		State = ensureStreamNonNull(StreamVal, C, State);
if (!State)		if (!State)
return;		return;
State = ensureStreamOpened(StreamVal, C, State);		State = ensureStreamOpened(StreamVal, C, State);
		SzelethusUnsubmitted Done Reply Inline Actions This function is practically the same as `evalFeof`, can we merge them? Szelethus: This function is practically the same as `evalFeof`, can we merge them?
		balazskeAuthorUnsubmitted Done Reply Inline Actions It is not the same code ("EOF" and "other error" are interchanged), but can be merged somehow. balazske: It is not the same code ("EOF" and "other error" are interchanged), but can be merged somehow.
		baloghadamsoftwareUnsubmitted Done Reply Inline Actions Maybe using a template with a functor parameter and pass a lambda in the two instantiations? (See `DebugContainerModeling` and `DebugIteratorModeling`) and also some example is `Iterator.cpp`. baloghadamsoftware: Maybe using a template with a functor parameter and pass a lambda in the two instantiations?
		balazskeAuthorUnsubmitted Done Reply Inline Actions I have already a template implementation, but probably it is not needed now. balazske: I have already a template implementation, but probably it is not needed now.
if (!State)		if (!State)
return;		return;

C.addTransition(State);		C.addTransition(State);
}		}

ProgramStateRef		ProgramStateRef
StreamChecker::ensureStreamNonNull(SVal StreamVal, CheckerContext &C,		StreamChecker::ensureStreamNonNull(SVal StreamVal, CheckerContext &C,
▲ Show 20 Lines • Show All 130 Lines • Show Last 20 Lines

This is an archive of the discontinued LLVM Phabricator instance.

[Analyzer][StreamChecker] Introduction of stream error handling.ClosedPublic

Details

Diff Detail

Event Timeline

Revision Contents

Diff 250588

clang/lib/StaticAnalyzer/Checkers/StreamChecker.cpp

[Analyzer][StreamChecker] Introduction of stream error handling.
ClosedPublic