This is an archive of the discontinued LLVM Phabricator instance.

[Analyzer][StreamChecker] Introduction of stream error handling.
ClosedPublic

Authored by balazske on Mar 5 2020, 7:29 AM.

Details

Summary

A new field is added to the stream state to store the actual error.
Function fseek is implemented to set the error state.
(More are to be added later.)
Modeling of stream error state handling functions is added.

(clearerr is not done yet but should follow.)

Diff Detail

Event Timeline

balazske created this revision.Mar 5 2020, 7:29 AM
Herald added a project: Restricted Project. · View Herald Transcript

For older discussion see this:
https://reviews.llvm.org/D75356

Functions feof and ferror are added to show how the stream error flags are used and to make tests possible.

Some explanation of the error states:

  • EofError indicates an EOF condition in the stream.
  • OtherError indicates a generic (I/O or other but not EOF) error.
  • AnyError is a "placeholder" if the exact error kind is not important. This is a "Schrödinger's cat" type of state: If we need to observe the exact error it can change to Eof or Other error. (See code in evalFeof and evalFerror, probably this is the only place for this event.) This error kind is used to save state splits, other solution is to make for example after fseek a new state for EofError and another for OtherError. The file error functions are relatively seldomly used (?), because checking success of an operation can be done with the return value and there is no need to check ferror or feof. And for analyzer warnings it is often enough to know if any error happened, not if exactly EOF or other error.

We can not make a warning if a stream operation is called in failed state: The error was probably handled based on the return value of the previous failed operation. This thing is checked by the "ErrorReturnChecker" (https://reviews.llvm.org/D72705). Only some special functions like getc are to be handled specially where the error state has importance.

Szelethus requested changes to this revision.Mar 6 2020, 6:47 AM

Let's halt this for now -- we have so many revisions going on seemingly doing the same thing, its becoming really confusing. Please finish D75356 before proceeding.

This revision now requires changes to proceed.Mar 6 2020, 6:47 AM
balazske updated this revision to Diff 249085.Mar 9 2020, 6:48 AM

Removed support for fseek, added clearerr.
This is now error handling only without fseek.
No functions set the error flags so test for
clearerr is not possible.

balazske updated this revision to Diff 249107.Mar 9 2020, 8:23 AM

Fixed lint warnings.

Can we see more test cases for when after a call to feof we indeed can tell the stream is in an EOF state? Same for ferror. I feel like there is a lot of code we don't yet cover.

Also, I see now that I didn't get in earlier revisions that OtherError actually refers to ferror() -- sorry about the confusion :)

clang/lib/StaticAnalyzer/Checkers/StreamChecker.cpp
68–80

Hmm, now that I think of it, could we just merge these 2 enums? Also, I fear that indexers would accidentally assign the comment to the enum after the comma:

Opened, /// Stream is opened.
Closed, /// Closed stream (an invalid stream pointer after it was closed).
OpenFailed /// The last open operation has failed.

/// Stream is opened might be assigned to Closed. How about this:

/// Stream is opened.
Opened,
/// Closed stream (an invalid stream pointer after it was closed).
Closed,
/// The last open operation has failed.
OpenFailed
78

Shouldn't we call this FError instead, if this is set precisely when ferror() is true? Despite the comments, I still managed to get myself confused with it :)

79

When do we know that the stream is in an error state, but not precisely know what that error is within the context of this patch? fseek would indeed introduce such a state, as described in D75356#inline-689287, but that should introduce its own error ErrorKind. I still don't really get why we would ever need AnyError.

88

Same, shouldn't this be isFError()?

89

This is confusing -- I would expect a method called isAnyError() to return true when isNoError() is false.

389

What if we call clearerr() on a stream that is in an feof() state? Shouln't we return if the stream is !isOtherError() (or !isFError(), if we were to rename it)?

417

Does this ever run? We never actually set the stream state to AnyError.

438–439

This function is practically the same as evalFeof, can we merge them?

Szelethus requested changes to this revision.Mar 10 2020, 6:52 AM
This revision now requires changes to proceed.Mar 10 2020, 6:52 AM
balazske marked 6 inline comments as done.Mar 10 2020, 8:34 AM

The problem here with tests is that there is no function to call that sets the error flags FEof or FError. So it is only possible to check if these are not set after opening the file.

clang/lib/StaticAnalyzer/Checkers/StreamChecker.cpp
68–80

Probably these can be merged, it is used for a stream that is in an indeterminate state after failed freopen, but practically it is handled the same way as a closed stream. But this change would be done in another revision.

78

Plan is to rename to NoError, FEofError, FErrorError, FEofOrFErrorError.

79

This is the problem is the change is too small: The next part of it introduces functions where the new error flags are used. In this patch the AnyError feature is not used (the feof and ferror read it but nothing sets it).

89

The error state kinds are not mutually exclusive. The "any error" means we know that there is some error but we do not know what the error is (it is not relevant because nothing was done that depends on it). If a function is called that needs to know if "ferror" or "feof" is the error, the state will be split to FErrorError and FEofError.

389

clearerr does not return any value. It only clears the error flags (sets to false). In my interpretation the stream has one error value associated with it, this value may be EOF or "other" error, or no error. There are not two error flags, only one kind of error that may be EOF or other.

438–439

It is not the same code ("EOF" and "other error" are interchanged), but can be merged somehow.

You've sold me on AnyError, we should have something like that with the addition of NoteTags. With that said, I feel uneasy about adding it in this patch and not testing it properly. I can also sympathize with your anxiety against bloating the fseek patch further by simply moving the code there, but I'm just not sure there is a good way to avoid it. I have a suggestion:

  1. Remove the code that adds and handles AnyError. Merge the two enums in the StreamState, and make ensureSteamOpened emit a warning if the stream is either feof() or ferror(). Add NoteTags saying something along the lines of:
if (feof(F)) { // note: Assuming the condition is true
               // note: Stream 'F' has the EOF flag set
  fgets(F); // warning: Illegal stream operation on a stream that has EOF set
}

I think that would make this a very neat, self contained patch.

  1. Add the removed code to the fseek patch (preferably with the name UnknownError). Make ensureSteamOpened emit a warning on this enum value as well. I don't think it would bloat the patch too much -- we just simply need that much to make sense of it.

What do you think?

clang/lib/StaticAnalyzer/Checkers/StreamChecker.cpp
68–80

I meant to merge the two enums (StreamState and ErrorKindTy) and the fields associated with them (State and ErrorState). We could however merge Closed and OpenFailed, granted that we put a NoteTag to evalFreopen. But I agree, that should be another revision's topic :)

79

Okay, after your comment in the other revision, I see how such a kind should be left, and the bug report should rather be enriched by NoteTags. With that said, AnyError is still a bit confusing as a name -- how abour UnknownError?

86

If a stream's opening failed, its still an error, yet this getter would return true for it. I think this is yet another reason to just merge the 2 enums :)

89

How about we change the name of it to isUnknownError? My main issue is that to me, the name isAnyError() answeres the question whether the stream is in any form of erroneous state.

389

The clearerr() function clears the end-of file and error indicators for the given stream.

Yup, you're totally right on this one. My bad.

balazske marked 2 inline comments as done.Mar 11 2020, 2:27 AM

I do not understand totally this remove-of-AnyError thing. If handling the AnyError will be removed the code remains still not testable. Because none of the possible failing file operations are modeled in this revision. A test with code like if (feof(F)) { can not work because feof(F) is never true without a previously failed operation. (We decide not at the moment of calling feof that feof will return true, it is decided at a previously failed operation, together with the return value of that operation.)

clang/lib/StaticAnalyzer/Checkers/StreamChecker.cpp
79

I want it to rename to FEofOrFErrorError, this tells exactly what it is (UnknownError can be still thought as a third error type). For these 2 error types this naming is not too long, and new error types are not likely to appear here later.

86

But as the comment says, the error state is applicable only in the opened state. If not opened, we may not call the "error" functions at all (assertion can be added). Anyway it is possible to merge the enums, then the isOpened will be a bit difficult (true if state is one of OpenedNoError, OpenedFErrorError, OpenedFEofError, OpenedFEofOrFErrorError). Logically it is better to see that the state is an aggregation of two states, an "openedness" and an error state. Probably this does not matter much here because it is unlikely that new states will appear that make the code more complex.

xazax.hun added inline comments.Mar 11 2020, 3:53 AM
clang/lib/StaticAnalyzer/Checkers/StreamChecker.cpp
68–80

Since you mentioned that ErrorState is only applicable to Open streams, I am also +1 on merging the enums. These two states are not orthogonal, no reason for them to be separate.

428

As you probably know we are splitting the execution paths here. This will potentially result in doubling the analysis tome for a function for each feof call. Since state splits can be really bad for performance we need good justification for each of them.
So the questions are:

  • Is it really important to differentiate between eof and other error or is it possible to just have an any error state?
  • Do we find more bugs in real world code that justifies these state splits?
clang/lib/StaticAnalyzer/Checkers/StreamChecker.cpp
68–80

Not orthogonal, but rather hiearchical. That is a reason for not merging. I am completely against it.

78

If would suggest NoError, FEoF, FError, FEoFOrFError. Too many Error suffixes reduce the readability.

86

I would not merge the two enums: this way they are hierarchical. When only checking the openness we do not need to check for 4 different states. I completely agree with @balazske here.

89

Yes, "Any" usually means any of the possible kinds. But instead of UnknownError use FEoFOrFerror.

417

I suggest to move codes that are only executed after a new functionality is added in a subsequent patch to that patch.

428

I agree here. Do not introduce such forks without specific reason.

438–439

Maybe using a template with a functor parameter and pass a lambda in the two instantiations? (See DebugContainerModeling and DebugIteratorModeling) and also some example is Iterator.cpp.

balazske marked 2 inline comments as done.Mar 11 2020, 5:14 AM
balazske added inline comments.
clang/lib/StaticAnalyzer/Checkers/StreamChecker.cpp
428

feof and ferror should return the same value if called multiple times (and no other file operations in between). If the exact error is not saved in the state, how can we ensure that the calls return the same value?

438–439

I have already a template implementation, but probably it is not needed now.

xazax.hun added inline comments.Mar 11 2020, 5:22 AM
clang/lib/StaticAnalyzer/Checkers/StreamChecker.cpp
68–80

In a more expressive language each enum value could have parameters and we could have

Opened(ErrorKind),
Closed,
OpenFailed

While we do not have such an expressive language, we can simulate this using the current constructs such as a variant. The question is, does this worth the effort? At this point this is more like the matter of taste as long as it is properly documented.

clang/lib/StaticAnalyzer/Checkers/StreamChecker.cpp
78

I suggest to omit the last error state in this patch and introduce it in the patch which uses it.

What is better? Have this (or similar) change that adds a feature that is used in a later change and is only testable in that later change, or have a bigger change that contains real use of the added features? (This means: Add fseek to this change or not.) The error handling is not normally testable if there is no function that generates error state, fseek could be one.

balazske marked 4 inline comments as done.Mar 11 2020, 7:06 AM
balazske added inline comments.
clang/lib/StaticAnalyzer/Checkers/StreamChecker.cpp
68–80

Have a bool isOpened and an error kind is better?

What is better? Have this (or similar) change that adds a feature that is used in a later change and is only testable in that later change, or have a bigger change that contains real use of the added features? (This means: Add fseek to this change or not.) The error handling is not normally testable if there is no function that generates error state, fseek could be one.

I think neither: add the feature (the last enum value and its handling in the functions) in the fseek() patch were it is used.

And why not add first the "unknown error" state and later the feof and ferror error states? Or add every error state but not the feof and ferror functions? Every way results in an intermediate state of the checker.

I see that we're a bit stuck on naming, and that is largely my fault. Apologies, let us focus on high level stuff for a bit.

I do not understand totally this remove-of-AnyError thing. If handling the AnyError will be removed the code remains still not testable.

Remove the code that adds and handles AnyError. Merge the two enums in the StreamState, and make ensureSteamOpened emit a warning if the stream is either feof() or ferror(). Add NoteTags saying something along the lines of:

if (feof(F)) { // note: Assuming the condition is true
               // note: Stream 'F' has the EOF flag set
  fgets(F); // warning: Illegal stream operation on a stream that has EOF set
}

If we added warnings to this patch, that would be testable.

What is better? Have this (or similar) change that adds a feature that is used in a later change and is only testable in that later change, or have a bigger change that contains real use of the added features? (This means: Add fseek to this change or not.) The error handling is not normally testable if there is no function that generates error state, fseek could be one.

Definitely the latter, and all you need to do is check whether the stream is in an error state in ensureStreamOpened, no need for fseek just yet.

I think neither: add the feature (the last enum value and its handling in the functions) in the fseek() patch were it is used.

Yup.

clang/lib/StaticAnalyzer/Checkers/StreamChecker.cpp
68–80

That sounds wonderful, @balazske! :)

79

Okay, sure.

Also, thank you guys for chipping in!

clang/lib/StaticAnalyzer/Checkers/StreamChecker.cpp
428

This is a very interesting question indeed, never crossed my mind before. I think it would be better to move AnyError to the next revision and discuss it there.

clang/lib/StaticAnalyzer/Checkers/StreamChecker.cpp
68–80

Yes, we do not need the OpenFailed state either. A stream is either opened or not. This is the state of the stream. If there is any error, there are the error codes for.

balazske updated this revision to Diff 249642.Mar 11 2020, 8:42 AM
  • Removed AnyError.
  • Using common code for evalFeof and ferror.
balazske updated this revision to Diff 249643.Mar 11 2020, 8:45 AM

Adding correct diff.

Harbormaster completed remote builds in B48825: Diff 249643.

Could you please add the warning we discussed? That would be a great demonstration of this patch's capabilities, and wouldn't deserve its own patch.

Could you please add the warning we discussed? That would be a great demonstration of this patch's capabilities, and wouldn't deserve its own patch.

Was it this warning?

if (feof(F)) { // note: Assuming the condition is true
               // note: Stream 'F' has the EOF flag set
  fgets(F); // warning: Illegal stream operation on a stream that has EOF set
}

The checker does not work by "assuming feof is true" (this can be a later feature, if the F is not tracked and a feof call is encountered). Value of feof (more precisely: if any error happened) is fixed when the file operation is called that causes the error. The warning above should be get even if the feof call is missing because it is still possible that the previous operation resulted in EOF state. Adding a similar warning is not a small change and the "infrastructure" for it is not ready in this revision. And still that warning would not be testable because FEof error state is never set in this patch. By the way, a file read is not invalid if any error is there, it simply fails again. But there is many possibility for warnings such as "file read called in EOF state", "feof called when it is proven to be false/true", "invalid file read/write after failed fseek/fread/fwrite" but these belong to next revisions.

Probably it would be better to make a full working prototype first, because the design decisions made now can turn out to be wrong later when the new functionality would be added and we do "not see the full picture" now.

balazske marked 9 inline comments as done.Mar 13 2020, 1:47 AM
balazske added inline comments.
clang/lib/StaticAnalyzer/Checkers/StreamChecker.cpp
68–80

The "error kind" concept is needed separately from the stream state, see D75851. This is a reason for not merging them.

417

This is not applicable any more, AnyError is removed (from here).

428

The state fork problem is solved with UnknownError is the later revision, here it is no problem any more. A single state split is always needed because the return values differ in the failed and success case, and I want to make correlation between return values and stream error state.

Thanks for sticking this out! It just takes me a while to get to the mental space regarding this patch you are already in, sometimes through saying incorrect statements or stupid suggestions! :) If others could chip in in this discussion, it would be most appreciated, because I fear that otherwise this patch is a bit too exposed to my own wrongs.

Could you please add the warning we discussed? That would be a great demonstration of this patch's capabilities, and wouldn't deserve its own patch.

Was it this warning?

if (feof(F)) { // note: Assuming the condition is true
               // note: Stream 'F' has the EOF flag set
  fgets(F); // warning: Illegal stream operation on a stream that has EOF set
}

The warning above should be get even if the feof call is missing because it is still possible that the previous operation resulted in EOF state.

What I meant to demonstrate (but failed to describe) is that if the condition of the if statement is !feof(F), we shouldn't get the warning.

The checker does not work by "assuming feof is true" (this can be a later feature, if the F is not tracked and a feof call is encountered). Value of feof (more precisely: if any error happened) is fixed when the file operation is called that causes the error. [...] And still that warning would not be testable because FEof error state is never set in this patch.

I suspect that at this point you are 2 steps ahead of me, but this is how I would imagine such a patch to look like: we evaluate feof() by splitting the state (this should be worth the cost, because why would we ever call feof if the state couldn't be EOF or non-EOF?), associate the state where its return value is true with the argument being marked as EOF, and the other as non-EOF.

This obviously wouldn't be sufficient -- values retrieved from stream reading operations may be EOF themselves, which should make us set the appropriate state for the originating stream, but that should indeed be the topic of a followup patch.

What is the goal here if not this? I have a suspicion I'm just not seeing the obvious here, but I don't get it just yet.

By the way, a file read is not invalid if any error is there, it simply fails again.

Oh, yea, silly me. You're totally right :)

Probably it would be better to make a full working prototype first, because the design decisions made now can turn out to be wrong later when the new functionality would be added and we do "not see the full picture" now.

If you try to implement a warning just to see how this would turn out long term, that would indeed be a good idea. Feel free to upload them, if you prefer, as a WIP patch!

clang/lib/StaticAnalyzer/Checkers/StreamChecker.cpp
419–420

Is this correct? We set the feof() return value whether we know the state of the stream to be EOF, but in this case we would incorrectly mark it as non-EOF:

$ cat a.txt # contains a single newline

$ cat test.cpp
#include <cstdio>

void tryRead(FILE *F) {
  char C = fgetc(F);

  if (feof(F))
    printf("The stream is EOF!\n");
  else
    printf("The stream is good! Read '%c'\n", C);
}

int main() {
  FILE *F = fopen("a.txt", "r");
  tryRead(F);
  tryRead(F); // Incorrectly mark F to be non-EOF
}
$ clang test.cpp -fsanitize=address && ./a.out 
The stream is good! Read '
'
The stream is EOF!

Wouldn't it be safer to assume it to be true if we know its EOF and do nothing otherwise? How about a state split?

(I know this functions also handler FError, but you see what my point is.)

clang/test/Analysis/stream-error.c
33–34

Sure, we don't, but the bigger issue here is that we wouldn't mark F to be in EOF even if we did handle it, we would also need to understand that ch == EOF is the telling sign. But that also ins't in the scope of this patch :)

balazske marked an inline comment as done.Mar 13 2020, 9:14 AM
balazske added inline comments.
clang/lib/StaticAnalyzer/Checkers/StreamChecker.cpp
419–420

The problem in this code seems to be that fgetc is not handled (yet) by the checker. The plan is to handle every possible file operation. The handler of fgetc would split the state to EOF and non-EOF case. In the current state this is a real "bug", it could be handled somehow by checking for escaped stream (the fgetc here should cause escape of the file pointer if not recognized as file operation). But this means again a new patch before this one. Or add every file operation to this patch with some simple implementation or with "forgetting" the stream?

NoQ added a comment.Mar 15 2020, 7:54 PM

It looks like for now there's no way to put the stream into an error state, right?

clang/lib/StaticAnalyzer/Checkers/StreamChecker.cpp
68–80

Yes, we do not need the OpenFailed state either. A stream is either opened or not. This is the state of the stream. If there is any error, there are the error codes for.

Nope, there's also the state in which the stream was before it was opened, and it's different from being opened and closed. Most of the time we simply don't track the stream in such state, but when an open fails, we suddenly start tracking it, so we need a separate enum. Like, we'll have to emit different diagnostic messages depending on whether the stream is in Closed state or in an OpenFailed state and this information should be there at the bug report site, so it's part of the program state.

88

llvm::function_ref?

112

Simply C.getLocationContext().

392

So what exactly happens when you're calling clearerr on a closed stream? Does it actually become opened? Also what happens when you're calling it on an open-failed stream?

405–407

Maybe we should add a helper function for this, CallEvent::getOriginCallExpr(). That's an otherwise ugly function given that CallEvent is a class hierarchy and the superclass doesn't need to know about subclasses, but most users do in fact only care about call-expressions and concise checker API is important. Same with dyn_cast_or_null<FunctionDecl>(Call.getDecl()) that could be written as Call.getFunctionDecl().

414

I recommend not introducing a new symbol when you're going to return 0 anyway. A plain 0 is easier on the constraint solver than a symbol constrained to 0.

clang/test/Analysis/stream-error.c
33–34

Yeah, that's gonna be fun.

balazske marked 6 inline comments as done.Mar 16 2020, 2:00 AM
balazske added inline comments.
clang/lib/StaticAnalyzer/Checkers/StreamChecker.cpp
88

function_ref's documentation says:

This class does not own the callable, so it is not in general safe to store a function_ref.

The FnDescription stores these functions so it is not safe to use llvm::function_ref?

392

The stream gets always in opened and error-free state. The preDefault checks that the stream is opened (not closed and not open-failed) and creates error node if not. If this create fails that is a problem case, the stream will be opened after clearerr. Should check in the eval functions for closed stream even if the pre-functions (in preCall callback) have normally checked this?

405–407

Can be done in another revision together with changing every use of dyn_cast_or_null<CallExpr>(Call.getOriginExpr()) to the new function.

clang/test/Analysis/stream-error.c
33–34

If there is a fputc call (for example) that is recognized (has recognized stream and no functions "fail") there is already an error and non-error branch created for it. On the error branch the return value of fputc would be EOF and the stream state is set to error. (The fputc should be later modeled by this checker, not done yet.)

If the ch == EOF is found then to figure out if this ch comes from an fputc that is called on a stream that is not recognized (it may be function parameter) and then make a "tracked" stream with error state for it, this is another thing.

NoQ added inline comments.Mar 16 2020, 7:02 AM
clang/lib/StaticAnalyzer/Checkers/StreamChecker.cpp
88

I think you're using it only for global function pointers, no?

392

Aha, ok, got it.
Let's assert it then?

clang/test/Analysis/stream-error.c
33–34

Aha, fair enough, that's probably a well-justified state split.

balazske updated this revision to Diff 250588.Mar 16 2020, 10:07 AM
balazske marked 2 inline comments as done.

Addressed some comments.

balazske marked 3 inline comments as done.Mar 16 2020, 10:10 AM
balazske added inline comments.
clang/lib/StaticAnalyzer/Checkers/StreamChecker.cpp
88

Probably can work but I tried it and got compile errors.

Riiiight I think I finally get it. You don't want to state split on feof() and ferror(), but rather on the stream operations! This is why you only want to tell the analyzer what the return value of these functions are going to be, because the state of the stream should be set by the time we reach these functions, right?

How about untracked streams? What if we call feof() on a stream we got from a parameter, wouldn't that suggest that the stream is probably non-null and could either be EOF or non-EOF?

clang/lib/StaticAnalyzer/Checkers/StreamChecker.cpp
88

It doesn't matter a whole lot, we could leave this as-is :)

Riiiight I think I finally get it. You don't want to state split on feof() and ferror(), but rather on the stream operations!

Yes the split is at the operations. I did not think even of splitting at start of feof to determine "backwards" the result of the previous operation. This could be another approach. But the way of split depends on the previous operation (to check if error is possible based on possible constraints on its return value), probably not better (but less state split?).

How about untracked streams? What if we call feof() on a stream we got from a parameter, wouldn't that suggest that the stream is probably non-null and could either be EOF or non-EOF?

Currently the operation is completely ignored on untracked stream. Could start with a non-null stream and every error case is possible.

Riiiight I think I finally get it. You don't want to state split on feof() and ferror(), but rather on the stream operations!

Yes the split is at the operations. I did not think even of splitting at start of feof() to determine "backwards" the result of the previous operation. This could be another approach. But the way of split depends on the previous operation (to check if error is possible based on possible constraints on its return value), probably not better (but less state split?).

This is probably why it took a bit for us to understand each other! :) Generally speaking, we try to avoid state splitting as much as possible, but you totally convinced me, this isn't the place where we should be conservative. One should always expect stream operations to potentially fail, and that really does introduce two separate paths of execution.

The main issue that remains is testability, and now that we are on the same page, I see why you were concerned about it. I have a couple ideas:

  • For streams where the precise state is unknown (they are not tracked), start tracking. If we explicitly check whether a state is foef(), we can rightfully assume both of those possibilities.
  • Add debug function similar to clang_analyzer_express, like clang_analyzer_set_eof(FILE *), etc.
  • Evalulate a less complicated stream modeling function that sets such flags, though I suspect the followup patch is supposed to be the one doing this.

What do you think?

  • For streams where the precise state is unknown (they are not tracked), start tracking. If we explicitly check whether a state is foef(), we can rightfully assume both of those possibilities.

This can follow in a later revision only. The new stream should have every error state possible, so when it is introduced 3 new states are required (with the error types), or an "unknown error" is needed that will be added later.

  • Add debug function similar to clang_analyzer_express, like clang_analyzer_set_eof(FILE *), etc.

How to add this? Probably it is too much work and I do not like if separate such debug functions are there for "every" checker (or many of).

  • Evalulate a less complicated stream modeling function that sets such flags, though I suspect the followup patch is supposed to be the one doing this.

The fseek patch can do this.

  • For streams where the precise state is unknown (they are not tracked), start tracking. If we explicitly check whether a state is foef(), we can rightfully assume both of those possibilities.

This can follow in a later revision only. The new stream should have every error state possible, so when it is introduced 3 new states are required (with the error types), or an "unknown error" is needed that will be added later.

  • Evalulate a less complicated stream modeling function that sets such flags, though I suspect the followup patch is supposed to be the one doing this.

The fseek patch can do this.

Very well.

  • Add debug function similar to clang_analyzer_express, like clang_analyzer_set_eof(FILE *), etc.

How to add this? Probably it is too much work and I do not like if separate such debug functions are there for "every" checker (or many of).

@steakhal recently made a patch that contained that and not much else: D74131. I share you dislike with the debug function, but I suspect that it wouldn't be too much work, especially compared to my other suggestions that would introduce testability, unless you have some other ideas (to which I'd be open to!).

clang/lib/StaticAnalyzer/Checkers/StreamChecker.cpp
28

Create -> Creation

Adding special test functions is not as easy, then StreamState should be accessible from another checker. It could be added to the same file, or new file but then moving the data structures into header is needed. At least for short-term it is more simple to add a stream function that generates error state (fseek is applicable or other). This is why the fseek patch is better to be included in this change (but makes it more complicated). Otherwise this revision must be added without tests for state observer functions.

Adding special test functions is not as easy, then StreamState should be accessible from another checker. It could be added to the same file, or new file but then moving the data structures into header is needed.

For the time being, I don't fancy the idea of moving code to a header file.

At least for short-term it is more simple to add a stream function that generates error state (fseek is applicable or other). This is why the fseek patch is better to be included in this change (but makes it more complicated). Otherwise this revision must be added without tests for state observer functions.

It seems like D75356 should have been the revision to finish, after all :^). I agree -- don't worry about bloating the patch a bit, I've spend a lot of time with your patches, I feel confident in my ability at this point to be able to thoroughly review it.

So what is missing or wrong in this change to be accepted?

So what is missing or wrong in this change to be accepted?

This change is functional but is not testable. But the high level idea is great! I think we should maybe merge this with the followup, as you originally intended to.

Also, judging from the fact that how differently we imagined this to go (even though imo your idea is superior to mine), we should explain what our state splitting ideology is in the code.

balazske updated this revision to Diff 253587.Mar 30 2020, 7:06 AM

Added test checker.

Ping.
It is now testable, a sub-checker was added for testing (that can be useful at later improvements too).

Szelethus accepted this revision.Apr 6 2020, 3:39 AM

LGTM, let's go! Thank you so much for sticking this out!

clang/lib/StaticAnalyzer/Checkers/StreamChecker.cpp
38–48

Can a stream be both feof and ferror? Let's not delay this patch one moment longer, but a TODO to give this some thought might be nice.

clang/test/Analysis/stream-error.c
1

core isn't enabled! :o

This revision is now accepted and ready to land.Apr 6 2020, 3:39 AM
balazske marked an inline comment as done.Apr 6 2020, 5:45 AM
balazske added inline comments.
clang/test/Analysis/stream-error.c
1

Is it needed? It was probably not set (and is not) in the original test in stream.c.

balazske updated this revision to Diff 255360.Apr 6 2020, 8:39 AM
  • Adding comment about ferror, feof.
  • Enabled core checker in test file.
NoQ added inline comments.Apr 6 2020, 11:29 AM
clang/include/clang/StaticAnalyzer/Checkers/Checkers.td
455

This should go into the debug package. We don't want users to enable it accidentally by enabling the whole unix package.

clang/lib/StaticAnalyzer/Checkers/StreamChecker.cpp
25–27

assertStreamStateOpened(SS); isn't much harder to type. Can we make it a function please?

Szelethus added inline comments.Apr 6 2020, 12:09 PM
clang/include/clang/StaticAnalyzer/Checkers/Checkers.td
455

Oh, nice catch!

clang/lib/StaticAnalyzer/Checkers/StreamChecker.cpp
25–27

Mind that this is completely stripped from release builds, unlike a function call. I think.

NoQ added inline comments.Apr 6 2020, 12:47 PM
clang/lib/StaticAnalyzer/Checkers/StreamChecker.cpp
25–27

I'd be shocked if that's the case. In LLVM we have a lot of tiny functions all over the place and we'd better have them inlined most of the time.

Szelethus added inline comments.Apr 6 2020, 1:14 PM
clang/lib/StaticAnalyzer/Checkers/StreamChecker.cpp
25–27

Ah, okay. Lets turn this into a regular function then!

balazske updated this revision to Diff 255602.Apr 6 2020, 11:58 PM
balazske marked an inline comment as done.

Moved test checker to debug package, changed macro to function.

Szelethus accepted this revision.Apr 7 2020, 5:22 AM

Whoo!

balazske updated this revision to Diff 255906.Apr 7 2020, 11:51 PM

Adding diff to master.

This revision was automatically updated to reflect the committed changes.