This is an archive of the discontinued LLVM Phabricator instance.

[Analyzer][StreamChecker] Added check for "indeterminate file position".
ClosedPublic

Authored by balazske on May 15 2020, 9:23 AM.

Details

Summary

According to the standard, after a wread or fwrite call the file position
becomes "indeterminate". It is assumable that a next read or write causes
undefined behavior, so a (fatal error) warning is added for this case.
The indeterminate position can be cleared by some operations, for example
fseek or freopen, not with clearerr.

Diff Detail

Event Timeline

balazske created this revision.May 15 2020, 9:23 AM
Szelethus added a subscriber: NoQ.May 21 2020, 3:02 AM

I have more questions than actual objections, but here we go! The patch looks nice overall, we just need to iron a few things out ahead of time.

clang/lib/StaticAnalyzer/Checkers/StreamChecker.cpp
124

What does this mean? "An EOF+indeterminate state is the same as EOF state." I don't understand the message you want to convey here -- is it that we cannot have an indeterminate file position indicator if we hit EOF, hence we regard a stream that is known to be EOF to have its file position indicator determinate?

A good followup question for the uninitiated would be that "Well why is it ever legal to construct a StreamState object that can both have the FilePositionIndeterminate set to true and the ErrorState indicate that the steam is known to be in EOF?" Well, the answer is that we may only realize later that the error state can only be EOF, like in here:

void f() {
 FILE *F = fopen(...);
 if (fseek(F, ...)) {
    // Could be either EOF, ERROR, and ofc indeterminate
    if (eof(F)) {
      // This is where we have a seemingly impossible stream state, but its not a programming error, its a design decision.
    }
}

This might warrant a bit on explanation either here, or in ensureNoFilePositionIndeterminate. Probably the latter.

With that said, can SteamState's constructor ensure that we do not create a known to be EOF stream that is indeterminate?

131

Let's not cheap out on this parameter name :^)

198–201

In time, we could create a new checker for each of these bug types, similar to how D77866 does it. But this is clearly a low prio issue, I'm just talking aloud.

313

Ooooor ensureFilePositionDeterminate? :D

845–846

I think we shouldn't say its ignored, but rather its impossible. I mean, if we have reached the end of the file, how could the file position indicator be indeterminate?

860

Good question. It would make sense... @NoQ, @xazax.hun?

863–874

Ugh, tough one. I think this just isn't really *the* error to highlight here, but rather that its bad practice to not check on the stream after an operation. But I'm willing to accept I might be wrong here.

clang/test/Analysis/stream-error.c
182

Here the user checked whether F is in eof or in ferror, and its in neither. Isn't it reasonable to assume then that the stream is fine?

Szelethus added inline comments.May 21 2020, 3:04 AM
clang/lib/StaticAnalyzer/Checkers/StreamChecker.cpp
124

Actually, not enforcing this could leave to false positives:

void f() {
 FILE *F = fopen(...);
 if (fseek(F, ...)) {
    // Could be either EOF, ERROR, and ofc indeterminate
    if (eof(F)) {
      clearerr(F);
      fseek(F, ...); // false positive warning
    }
}
Szelethus added inline comments.May 21 2020, 3:29 AM
clang/test/Analysis/stream-error.c
182

Oh wow I meant to delete this inline. Please disregard.

balazske marked 4 inline comments as done.May 21 2020, 4:08 AM
balazske added inline comments.
clang/lib/StaticAnalyzer/Checkers/StreamChecker.cpp
124

The comment wants to say only that if the ErrorState contains the ErrorFEof the value of filePositionIndeterminate is to be ignored for the EOF case. If the file is in EOF it does not matter what value filePositionIndeterminate has. The cause for this handling is that ErrorState handles multiple possible errors together but the indeterminate position does not apply to all. If ErrorState contains ErrorFEof and ErrorFError together and the filePositionIndeterminate is set, the position is not indeterminate in the EOF case. For EOF case we should know that the position is at the end of the file, not indeterminate.

Another solution for this problem can be to have a "ErrorFErrorIndeterminate" and "ErrorNoneIndeterminate" error type but this makes things more difficult to handle.

313

It is better to call "Invalid" or "Unknown" position, "indeterminate" is taken from the text of the C standard. I think "indeterminate" is a special name here that is better to have in this form always.

595

This is a place where FilePositionIndeterminate is set in the state. We do not know if FEOF or FERROR will happen. If it is FERROR the position is indeterminate, if it is FEOF the position is not indeterminate.

845–846

The previous comment tells why it is set. The FEOF case is represented in the data structure with indeterminate flag set or not set. Both mean the same state of the modeled stream: FEOF and not indeterminate.

Szelethus marked an inline comment as done.May 25 2020, 7:13 AM
Szelethus added inline comments.
clang/lib/StaticAnalyzer/Checkers/StreamChecker.cpp
124

What do you mean under the term "the EOF case"? When we know the stream to only be in the EOF state? The overall modeling seems correct, its just that little corner case that if we know that the stream hit EOF, the file position must be determinate.

313

Sure, I'm sold.

balazske marked an inline comment as done.May 26 2020, 12:30 AM
balazske added inline comments.
clang/lib/StaticAnalyzer/Checkers/StreamChecker.cpp
124

The "difficult" case is when we do not know if the stream is in error or EOF state. We know only that it is in one of these. And the FilePositionIndeterminate is set to true. In this state FEof and FError are true in ErrorState. If the program should know what the error is, it needs to generate 2 new states, one with only FEof true and other with FError. And how to set the FilePositionIndeterminate for the new states? For FError it must be set to true because it was true before, for FEof it is set to false because by definition it is not applied to FEof case. (This last is the EOF case above.)

This table shows what (real) stream state is represented (values at right) by what values in StreamState (values at left side):

FEofFErrorFilePositionIndeterminatestream feof?stream ferror?position indeterminate?
falsefalsetruefalsefalsetrue
truefalsetruetruefalsefalse
falsetruetruefalsetruetrue
truetruetrue---
Szelethus added inline comments.May 26 2020, 2:48 AM
clang/lib/StaticAnalyzer/Checkers/StreamChecker.cpp
124

Right. How about we make this field private and create a an isFilePositionIndeterminate() method that ensures that for the second row of that table we always return false? That would take some responsibility off of this interface's users.

847–848

This branch could be removed in its entirety.

860

Well, looking at other checkers, they usually early return and don't modify the state if the creation of the error node failed, so a nullptr return would be more appropriate.

863–874

Actually, I take this back. If we had NoteTags to explain that "this stream operation failed and left the stream file position indication in an indeterminate state", this would be great.

balazske marked an inline comment as done.May 26 2020, 8:06 AM
balazske added inline comments.
clang/lib/StaticAnalyzer/Checkers/StreamChecker.cpp
863–874

Adding of NoteTag and other bug path improvements are planned in the next changes. (The message text could mention that error check was probably forgotten. But the faulty function can be called despite the error check was done.)

Szelethus added inline comments.May 26 2020, 12:20 PM
clang/lib/StaticAnalyzer/Checkers/StreamChecker.cpp
863–874

Cool! Proceed as its most convenient, no need to rush it while the checker is in alpha.

balazske updated this revision to Diff 266434.May 26 2020, 11:52 PM
  • Rebase
  • Some StreamState members are const and checked in constructor
  • Restructuring of some functions.
balazske updated this revision to Diff 266440.May 27 2020, 12:29 AM
balazske marked 2 inline comments as done.
  • Fixed a bug in previous update
  • Added new tests
balazske marked 10 inline comments as done.May 27 2020, 12:32 AM

LGTM, I just have one question in the inlines.

clang/lib/StaticAnalyzer/Checkers/StreamChecker.cpp
111–114

Cool! Though it makes me wish we had a constructor that just constructs a stream in EOF that doesn't require a file position argument, because this will ensure that the state is correct, but its still the user of this interface who is responsible for making it so.

With that said, feel free to leave it as-is, if there ever emerges a desire for such an interface, we can make that happen anytime.

670–673

But what if I inspect errno and conclude that the file position is intact? Is there such a thing? After a call to stream operation that might leave the file position indicator indeterminate and getting ferror, am I supposed to fseek or reopen the file no matter what?

balazske marked an inline comment as done.May 27 2020, 2:47 AM
balazske added inline comments.
clang/lib/StaticAnalyzer/Checkers/StreamChecker.cpp
670–673

There is not much information about what errno values mean an indeterminate file position (and what errno values are possible at all after a specific operation). Probably we can assume that only position reset (fseek, fsetpos, rewind) and fflush fixes the wrong state. Many of stream handling code anyway aborts the operation if error happens.

Szelethus accepted this revision.May 27 2020, 5:24 AM

LGTM! This patch is great! I think you're doing a great job with this project.

clang/lib/StaticAnalyzer/Checkers/StreamChecker.cpp
670–673

Yeah, this makes sense. We should double check the standard in time though.

This revision is now accepted and ready to land.May 27 2020, 5:24 AM
This revision was automatically updated to reflect the committed changes.