This is an archive of the discontinued LLVM Phabricator instance.

[Analyzer][StreamChecker] Added evaluation of fseek.
ClosedPublic

Authored by balazske on Mar 9 2020, 8:42 AM.

Details

Summary

Function fseek is now evaluated with setting error return value
and error flags.

Diff Detail

Event Timeline

balazske created this revision.Mar 9 2020, 8:42 AM
Herald added a project: Restricted Project. · View Herald Transcript
Szelethus requested changes to this revision.Mar 10 2020, 7:39 AM

We should totally dedicate an error kind of fseek, see (and please respond, if you could to) D75356#inline-689287.

clang/lib/StaticAnalyzer/Checkers/StreamChecker.cpp
451–453

This seems a bit excessive, we could merge the last two into FSeekError.

This revision now requires changes to proceed.Mar 10 2020, 7:39 AM
balazske marked an inline comment as done.Mar 10 2020, 8:52 AM

I do not like to make difference between error from fseek and error from any other function that may fail. AnyError can be used to handle every case. After a failed file operation we do not know if it was feof or ferror kind of error. fseek is only special because it can fail additionally without feof or ferror. If the kind of the last operation needs to be known later, it can be saved in the stream state.

clang/lib/StaticAnalyzer/Checkers/StreamChecker.cpp
451–453

There are 3 cases:

  • fseek did not fail at all. Return value is zero. This is StateNotFailed.
  • fseek failed but none of the error flags is true afterwards. Return value is nonzero but feof and ferror are not true. This is StateFailedWithoutFError.
  • fseek failed and we have feof or ferror set (never both). Return value is nonzero and feof or ferror will be true. This is StateFailedWithFError. And an use of AnyError, otherwise we need here 2 states, one for feof and one for ferror. But it is not important here if feof or ferror is actually true, so the special value AnyError is used and only one new state instead of two.

I do not like to make difference between error from fseek and error from any other function that may fail. AnyError can be used to handle every case.

Okay, it took me a while, but I think I understand where you're coming from. Lets preserve this enum. I think NoteTags would be a better fit to pinpoint which function left the stream in a state that should be checked before usage. This patch is quite small, could we add one then?

After a failed file operation we do not know if it was feof or ferror kind of error.

Is this true for that many stream operations?

clang/lib/StaticAnalyzer/Checkers/StreamChecker.cpp
451–453

Sure, but from the point of the analyzer the latter two are the same, isn't it? Its never a good idea to use a stream on which fseek failed without checking.

State splits are the most expensive things the analyzer can make, which is why I'm cautious here.

After a failed file operation we do not know if it was feof or ferror kind of error.

Is this true for that many stream operations?

I looked it up, and it totally is.

Probably we need to clarify when to make warnings. Originally I do not wanted a warning after for example failed read, because it is possible to retry the operation. But the standard says that after failed fread the file position is indeterminate, so a new read may get indeterminate values that is a case for warning. The character I/O functions may work not this way so after failed fgetc it may be possible to retry it and no warning is needed. Some operations such as fclose and fseek can be called in error (including EOF) state.

For this to work we need more kinds of error states, or save the type of the last operation in the state (and use it to determine if warning is needed). Probably the best is to have only a single "failed" state and determine what kind of error is possible based on the saved last operation type. A "feof" and "ferror" kind of error state is still needed because after a call to feof if it returned true the error type is fixed to EOF error.

clang/lib/StaticAnalyzer/Checkers/StreamChecker.cpp
451–453

Somehow, this should be done with just two new states. Maybe there should be an error state "Unknown" instead of OtherError (or FeoFOrFError what I suggested at the other patch) which can be any of the errors plus the NoError value. Later, when ferror() of feof() is checked we can do a second state split, but not earlier.

balazske marked an inline comment as done.Mar 11 2020, 8:17 AM
balazske added inline comments.
clang/lib/StaticAnalyzer/Checkers/StreamChecker.cpp
451–453

Yes it is possible to use two new states only. For example by having a new FSeekError that can "manifest" in feof, ferror or no error. Better is to have only one "last-operation-failed-with-any-error" state and determine later if needed what errors are possible (based on the last operation). For fseek it is possible to get feof, ferror, or none of these (but still failed fseek). For reading operations the error can be feof or ferror. After write only ferror is possible.

balazske updated this revision to Diff 249904.Mar 12 2020, 5:05 AM

Rebased, added handling of UnknownError.

NoQ added inline comments.Mar 15 2020, 8:30 PM
clang/lib/StaticAnalyzer/Checkers/StreamChecker.cpp
536–537

Please explain the high-level idea behind this code.

balazske marked an inline comment as done.Mar 17 2020, 12:34 AM
balazske added inline comments.
clang/lib/StaticAnalyzer/Checkers/StreamChecker.cpp
536–537

The higher level idea is that if a stream function fails we do not create a new state for every type of error (EOF and "other" error). Instead there will be an "unknown error" state. The description for each stream function contains what errors are possible after that function (PossibleErrors).

If it is needed to know the exact error (like here, what should feof return?) we look at the previous function to see what errors are possible after it. If EOF is not possible at all, the feof returns false. If EOF is possible and only one other type of error, we make a state split with EOF error and the other error set. If EOF and two possible other errors are possible there is state split again but the non-EOF state contains UnknownError.

In PossibleErrors the NoError state is possible. This indicates that the function failed (returned an error value) but the stream error flags are not set (can happen at fseek). There are 3 possible error values (EOF, "other" and no error), if after a feof there is UnknownError we know that the remaining 2 error types are possible.

balazske updated this revision to Diff 251640.Mar 20 2020, 7:39 AM

Simplified code.

Szelethus marked an inline comment as done.Apr 8 2020, 5:37 AM

Simplified code.

I can tell! The patch is amazing, the code practically reads itself aloud. I have some inlines but nothing major, the high level idea is great.

clang/lib/StaticAnalyzer/Checkers/StreamChecker.cpp
106–113

This feels clunky to me.

Suppose that we're curious about the stream reaching EOF. We know that the last operation on the stream could cause EOF and ERROR. This function then would return FError. That doesn't really feel like the answer to the question "isErrorPossible".

In what contexts do you see calling this function that isn't like this?:

Optional<StreamState::ErrorKindTy> NewError =
    SS->isErrorPossible(ErrorKind);
if (!NewError) {
  // This kind of error is not possible, function returns zero.
  // Error state remains unknown.
  AddTransition(bindInt(0, State, C, CE), StreamState::UnknownError);
} else {
  // Add state with true returned.
  AddTransition(bindAndAssumeTrue(State, C, CE), ErrorKind);
  // Add state with false returned and the new remaining error type.
  AddTransition(bindInt(0, State, C, CE), *NewError);
}

Would it make more sense to move the logic of creating state transitions into a function instead?

492–525

It took me a while to realize that this also includes the case where the stat is not in any error :) Could you add a line of comment please?

534–537

I gave this plenty of thought, and I now believe that the state split here is appropriate. We really should ensure that the user checked both feof and ferror.

clang/test/Analysis/stream-error.c
48

That is a bit misleading, because Eof isn't EOF as specified in the standard. How about IsEof?

balazske marked an inline comment as done.Apr 8 2020, 7:08 AM
balazske added inline comments.
clang/lib/StaticAnalyzer/Checkers/StreamChecker.cpp
106–113

The name of the function may be misleading. The non-empty Optional means "yes" and an empty means "no". And if yes additional information is returned in the optional value. If the transitions would be included in this function it needs more parameters. This function is for the part of the operation that computes the remaining error ( getRemainingPossibleError is a better name?). Later it will turn out in what way the function is used and if there is code repetition a new function can be introduced, but not now (the needed state transitions may be different).

balazske updated this revision to Diff 256023.Apr 8 2020, 7:53 AM
  • Rebase
  • Improved comments
  • Changed UnknownError to Unknown
  • Test changes (removed unneeded test)
Szelethus accepted this revision.Apr 9 2020, 6:20 AM

LGTM! You seem to have a firm grasp on where you want this checker to go, and I like everything about it.

The code needs seme clang-format treatment, and another eye might not hurt, but as far as I'm concerned, I'm super happy how this patch turned out.

clang/lib/StaticAnalyzer/Checkers/StreamChecker.cpp
106–113

getRemainingPossibleError sounds great! You convinced me, if we ever need to stuff more things into this function, we will do that then.

This revision is now accepted and ready to land.Apr 9 2020, 6:20 AM
balazske updated this revision to Diff 256317.Apr 9 2020, 8:19 AM

Addressed review comments (reformat and rename).

This revision was automatically updated to reflect the committed changes.