This is an archive of the discontinued LLVM Phabricator instance.

Differential D75356

[Analyzer][StreamChecker] Introduction of stream error state handling.
Abandoned · Public

Authored by balazske on Feb 28 2020, 8:17 AM.

Details

Reviewers

Szelethus
NoQ
baloghadamsoftware
martong
xazax.hun
rnkovacs
dcoughlin

Summary

This is a way to handle stream error state in StreamChecker.
This is an initial, work-in-progress version:
only storing the error state is implemented, along with creating error states
for the function 'fseek'. The same principle should work for other functions
and for checking whether a function is called in an error state.

Diff Detail

Repository: rG LLVM Github Monorepo

Event Timeline

balazske created this revision. Feb 28 2020, 8:17 AM

Herald added a reviewer: Szelethus. Feb 28 2020, 8:17 AM

Herald added a project: Restricted Project.

Herald added subscribers: cfe-commits, martong, Charusso and 9 others.

balazske added a parent revision: D75163: [analyzer][StreamChecker] Adding precall and refactoring. Feb 28 2020, 8:18 AM

Harbormaster completed remote builds in B47594: Diff 247273. Feb 28 2020, 8:42 AM

Szelethus added inline comments. Mar 2 2020, 6:54 AM

clang/lib/StaticAnalyzer/Checkers/StreamChecker.cpp
69–87	Shouldn't we merge this with `StreamState`?
91–124	Do you have other patches that really crave the need for this class? Why isn't `CallEvent::getReturnValue` sufficient? This is a legitimate question, I really don't know. :)

balazske marked 2 inline comments as done. Mar 2 2020, 7:18 AM

balazske added inline comments.

clang/lib/StaticAnalyzer/Checkers/StreamChecker.cpp
69–87	The intention was that the error state is only stored when the stream is opened (in opened state). Additionally it exists in the map only if there is error, so no "NoError" kind is needed. This is only to save memory, if it is not relevant I can move the error information into `StreamState` (that will contain two enums then).
91–124	This is an "interesting" solution for the problem that there is need for a function with 3 return values. The constructor performs the task of the function: Create a conjured value (and get the various objects for it). The output values are RetVal and RetSym, and the success state, and the call expr that is computed here anyway. It could be computed independently but if the value was retrieved once it is better to store it for later use. (I did not check how costly that operation is.) I had some experience that using only `getReturnValue` and make constraints on that does not work as intended, and the reason can be that we need to bind a value for the call expr otherwise it is an unknown (undefined?) value (and not the conjured symbol)?

Szelethus added inline comments. Mar 2 2020, 10:13 AM

clang/lib/StaticAnalyzer/Checkers/StreamChecker.cpp
69–87	I personally wouldn't worry about memory consumption in this case too much. Considering how much information needs to be stored for simple expressions, stream objects are relatively few and far between, even in projects that use them a lot. Having one more enum in `StreamState` would be better in this case then! :)
91–124	I suspect that `getReturnValue` might only work in `postCall`, but I'm not sure. I think instead of this class, a function returning a `std::tuple` would be nicer, with `std::tie` on the call site. You seem to use all 3 return values in the functions that instantiate `MakeRetVal` anyways :). In `StdLibraryFunctionsChecker`'s `evalCall`, the return value is explicitly constructed, and further constraints on it are only imposed in `postCall`. I wonder why that is. @martong, any idea why we don't `apply` the constraints for pure functions in `evalCall`?
347	According to the C99 standard (§7.19.9.2, paragraph 5): "After determining the new position, a successful call to the fseek function undoes any effects of the `ungetc` function on the stream, clears the end-of-file indicator for the stream, and then establishes the new position. After a successful fseek call, the next operation on an update stream may be either input or output." So it definitely doesn't clear the `EOF` flag on failure.

Szelethus added reviewers: NoQ, baloghadamsoftware, martong, xazax.hun, rnkovacs, dcoughlin. Mar 2 2020, 10:13 AM

balazske marked 2 inline comments as done. Mar 3 2020, 12:49 AM

balazske added inline comments.

clang/lib/StaticAnalyzer/Checkers/StreamChecker.cpp
91–124	The return value case is not as simple because the `DefinedSVal` has no default constructor, but it is not as bad to return only the `RetVal` and have a `CE` argument.
347	Yes, it says nothing about what happens with the EOF flag on failure, so it is better not to change it. And we do not know if it is possible to get an EOF error (seek to after the end of file?).
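
The `std::tuple`/`std::tie` suggestion and the default-constructor objection from the 91–124 thread can be illustrated with a small standalone sketch. `Value` is a hypothetical stand-in for `DefinedSVal` (it only mimics the missing default constructor), and `conjure`, `useStructuredBinding` and `useGet` are invented names, not analyzer API. C++17 structured bindings are shown only as one possible way around the restriction; the thread itself does not propose them.

```cpp
#include <tuple>

// Hypothetical stand-in for DefinedSVal: it has no default constructor.
struct Value {
  explicit Value(int Id) : Id(Id) {}
  int Id;
};

// Invented helper that returns three results at once.
std::tuple<Value, int, bool> conjure(int Seed) {
  return std::make_tuple(Value(Seed), Seed * 2, Seed != 0);
}

// std::tie needs already-constructed variables, so a plain "Value V;"
// declaration before the call would not compile. C++17 structured
// bindings construct the variables directly from the tuple instead.
int useStructuredBinding(int Seed) {
  auto [V, Sym, Ok] = conjure(Seed);
  return Ok ? V.Id + Sym : -1;
}

// Pre-C++17 alternative: keep the tuple and read members with std::get.
int useGet(int Seed) {
  std::tuple<Value, int, bool> R = conjure(Seed);
  return std::get<2>(R) ? std::get<0>(R).Id + std::get<1>(R) : -1;
}
```

Returning only the one value that lacks a default constructor and passing the rest as arguments, as the reply above proposes, avoids the issue without needing either workaround.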

Removed MakeRetVal, fixed a bug in evalFseek.

evalFseek is to be updated further.

Harbormaster completed remote builds in B47875: Diff 247814. Mar 3 2020, 1:16 AM

Updated StreamState to include the error state.

balazske marked 2 inline comments as done. Mar 3 2020, 1:50 AM

Harbormaster completed remote builds in B47880: Diff 247824. Mar 3 2020, 2:38 AM

Szelethus added inline comments. Mar 3 2020, 2:40 AM

clang/lib/StaticAnalyzer/Checkers/StreamChecker.cpp
31–33	Could you please move these to the individual enums? :) I have an indexer that can query those as documentation.
37–40	These too. Also, I'm not yet sure whether we need `OtherError` and `AnyError`, as stated in a later inline.
91–124	I like the current solution very much!
333–335	If we check in `preCall` whether the stream is opened why don't we conservatively assume it to be open?
351–354	But why? The standard suggests that the error state isn't changed upon failure. I think we should leave it as-is.
438–439	Right here, should we just assume a stream to be opened when we can't prove otherwise? `ensureStreamOpened` is only called when we are about to evaluate a function that assumes the stream to be opened anyways, so I don't expect many false positive lack of `fclose` reports.

balazske marked 4 inline comments as done. Mar 3 2020, 5:43 AM

balazske added inline comments.

clang/lib/StaticAnalyzer/Checkers/StreamChecker.cpp
333–335	If we do that we get a resource leak error for example in the test function `pr8081` (there is only a call to `fseek`). We can assume that if the function gets the file pointer as argument it does not "own" it so the close is not needed. Otherwise the false positive resource leaks come always in functions that take a file argument and use file operations. Or this case (the file pointer is passed as input argument, the file is not opened by the function that is analyzed) can be handled specially, we can assume that there is no error initially and closing the file is not needed.
351–354	The fseek can fail and set the error flag, see the example code here: https://en.cppreference.com/w/c/io/fseek Probably it can not set the EOF flag, according to that page it is possible to seek after the end of the file. So the function can succeed or fail with `OtherError`.
438–439	The warning is created only if we know that the stream is not opened. This function makes no difference if the stream is "tracked" (found in StreamMap) or not. In the not-tracked case it is the same as if it were opened. Probably the function can be changed to take a `StreamState` instead of `StreamVal` and the not-tracked case can be handled separately. Or this function can add the new `StreamState` in (opened state).

Moved enum comments.
Updated fseek behaviour.

balazske marked 5 inline comments as done. Mar 4 2020, 5:48 AM

balazske added inline comments.

clang/lib/StaticAnalyzer/Checkers/StreamChecker.cpp
37–40	I plan to use `AnyError` for failing functions that can have EOF and other error as result. At a later `ferror` or `feof` call a new state split follows to differentiate the error (instead of having to add 3 new states after the function, for success, EOF error and other error). If other operation is called we know anyway that some error happened.
333–335	This can be done in a next change. It involves more changes at other places. I think of inserting the state for the stream if it was not there before. But we need to save if this was such an insert or a normal `fopen` (and do not report resource leak for the "insert" case).
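
The deferred state split planned for `AnyError` (comment 37–40 above) might be modeled roughly as follows. This is a toy sketch outside the analyzer: the enum values are the ones from the diff, while `FeofSplit` and `splitOnFeof` are hypothetical names, and the branch rules are an assumption derived from the enum comments in the diff (`AnyError` means `EofError` or `OtherError`).

```cpp
// Error kinds named as in the diff under review.
enum ErrorKindTy { NoError, EofError, OtherError, AnyError };

// Hypothetical result of splitting the state at a later feof() call.
// HasTrue/HasFalse say whether the branch where feof() returns nonzero
// or zero is feasible; OnTrue/OnFalse give the error kind on that branch.
struct FeofSplit {
  bool HasTrue;
  ErrorKindTy OnTrue;
  bool HasFalse;
  ErrorKindTy OnFalse;
};

FeofSplit splitOnFeof(ErrorKindTy K) {
  switch (K) {
  case EofError: // only the "feof is true" branch is feasible
    return {true, EofError, false, NoError};
  case AnyError: // unknown error: split into EOF and non-EOF error
    return {true, EofError, true, OtherError};
  case OtherError: // only the "feof is false" branch is feasible
    return {false, NoError, true, OtherError};
  case NoError:
    return {false, NoError, true, NoError};
  }
  return {false, NoError, false, NoError};
}
```

With this shape a failing function adds a single `AnyError` state, and the extra states appear only when the program actually calls `feof` (or, analogously, `ferror`).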

Harbormaster completed remote builds in B48036: Diff 248155. Mar 4 2020, 6:11 AM

Okay, I think we're mostly in agreement so far -- could we implement a warning and add some test files for unchecked stream states after a failed fseek call?

The title of the revision is "[Analyzer][StreamChecker] Introduction of stream error state handling.", yet it is mostly about fseek, even if the intention is to demonstrate how error state handling would look like through its better modeling. How about we change the title to "[Analyzer][StreamChecker] Model fseek better by introducing stream error states"?

clang/lib/StaticAnalyzer/Checkers/StreamChecker.cpp
37–40	I think it would still be better to introduce them as we find uses for them :) Also, to stay in the scope of this patch, shouldn't we only introduce `FseekError`? If we did, we could make warning messages a lot clearer.
333–335	This can be done in a next change. Consider me convinced :)
351–354	Yup, right, you won :) I tried some examples out on my system, and it could preserve or change the error state of the stream. To me it seems that not checking the state of the stream after a failed `fseek` is surely inadvisable. This still should be `AnyError` (or `FseekError`), as according to `OtherError`'s description `OtherError` may not refer to `EOF`, yet after a failed `fseek` call the stream can be in `EOF`:

$ cat test.cpp
#include <cstdio>

int main() {
  FILE *F = fopen("test.cpp", "r");
  while (EOF != fgetc(F)) {}
  if (feof(F))
    printf("The file is closed!\n");
  // Return value is non-zero on failure.
  if (fseek(F, -100000, SEEK_END)) {
    if (feof(F))
      printf("The file is still closed!\n");
    else
      printf("The file is no longer closed!\n");
  }
}
$ clang test.cpp && ./a.out
The file is closed!
The file is still closed!

balazske marked 8 inline comments as done. Mar 5 2020, 12:27 AM

balazske added inline comments.

clang/lib/StaticAnalyzer/Checkers/StreamChecker.cpp

37–40

This change is generally about introduction of the error handling, not specifically about fseek. Probably not fseek was the best choice, it could be any other function. Probably I can add another, or multiple ones, but then again the too-big-patch problem comes up. (If now the generic error states exist the diffs for adding more stream operations could be more simple by only adding new functions and not changing the StreamState at all.) (How is this related to warning messages?)

351–354

After some experimenting I think it is best to have every possibility of errors after fseek. This means, either EOF, or other-error, or non-EOF and not other-error (but failed fseek call according to return value). So we need 1 success and 2 error return value states, one error return with AnyError and one with NoError (strange but happens according to the shown output). Probably there is relation between the previous state and the produced result of fseek but I do not want to figure out, it may be different in other systems and the documentations say nothing.

This comes from this program:

#include <stdio.h>

void print_result(FILE *F, int rc, const char *errtxt) {
  printf("--------\n%s", errtxt);
  if (rc) {
    printf("failed...\n");
    if (feof(F)) {
      printf("FEOF\n");
    }
    if (ferror(F)) {
      printf("FERROR\n");
      perror("error");
    }
  } else {
    printf("success\n");
  }
}

int main() {
  FILE *F = fopen("fseek.c", "r");

  while (EOF != fgetc(F)) {}
  print_result(F, 1, "read done\n");

  // Return value is non-zero on failure.
  int rc = fseek(F, -100000, SEEK_END);
  print_result(F, rc, "seek invalid\n");

  rc = fseek(F, 2, SEEK_END);
  print_result(F, rc, "seek valid\n");

  rc = fseek(F, -100000, SEEK_END);
  print_result(F, rc, "seek invalid\n");
  
  fputs("str", F);
  print_result(F, 1, "failed operation\n");

  rc = fseek(F, -100000, SEEK_END);
  print_result(F, rc, "seek invalid\n");

  rc = fseek(F, -1, SEEK_END);
  print_result(F, rc, "seek valid\n");

  rc = fseek(F, -100000, SEEK_END);
  print_result(F, rc, "seek invalid\n");
}

And the result is:

--------
read done
failed...
FEOF
--------
seek invalid
failed...
FEOF
--------
seek valid
success
--------
seek invalid
failed...
--------
failed operation
failed...
FERROR
error: Bad file descriptor
--------
seek invalid
failed...
FERROR
error: Invalid argument
--------
seek valid
success
--------
seek invalid
failed...
FERROR
error: Invalid argument
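
The three `fseek` outcomes described before the program (one success, two kinds of failure) can be written down as a small table. This is only a sketch of the proposed modeling, not code from the patch: `FseekOutcome` and `fseekOutcomes` are hypothetical names, and the reduction to just `NoError` and `AnyError` is an assumption based on the discussion.

```cpp
#include <vector>

// Reduced error kinds assumed for this sketch.
enum class ErrorKind { NoError, AnyError };

// One possible outcome of a modeled fseek call.
struct FseekOutcome {
  bool ReturnsZero; // fseek returns 0 on success, nonzero on failure
  ErrorKind Error;  // error state recorded for the stream afterwards
};

// Hypothetical helper listing the branches a checker would create.
std::vector<FseekOutcome> fseekOutcomes() {
  return {
      {true, ErrorKind::NoError},   // success
      {false, ErrorKind::AnyError}, // failure with feof or ferror set
      {false, ErrorKind::NoError},  // failure with no indicator set
  };
}
```

The third entry corresponds to the "seek invalid" case in the output above that prints `failed...` with neither `FEOF` nor `FERROR`.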

balazske marked 2 inline comments as done. Mar 5 2020, 12:32 AM

balazske added inline comments.

clang/lib/StaticAnalyzer/Checkers/StreamChecker.cpp
37–40	For now, the `EofError` and `OtherError` can be removed, in this change these are not used (according to how `fseek` will be done).

Szelethus added inline comments. Mar 5 2020, 3:48 AM

clang/lib/StaticAnalyzer/Checkers/StreamChecker.cpp
37–40	> This change is generally about introduction of the error handling, not specifically about fseek. Probably not fseek was the best choice, it could be any other function.

You could not have picked a better function! Since the rules around the error state of the stream after a failed fseek call are quite complex, few functions deserve their own error state more.

> (How is this related to warning messages?)

I had the following image in my head, it could be used at the bug report emission site to give a precise description:

if (SS->isInErrorState()) {
  switch (SS->getErrorKind()) {
  case FseekError:
    reportBug("After a failed fseek call, the state of the stream may "
              "have changed, and it might be feof() or ferror()!");
    break;
  case EofError:
    reportBug("The stream is in an feof() state!");
    break;
  case Errorf:
    reportBug("The stream is in an ferror() state!");
    break;
  case OtherError:
    // We don't know what the precise error is, but we surely
    // know it's in one.
    reportBug("The stream is in an error state!");
    break;
  }
}

> (If now the generic error states exist the diffs for adding more stream operations could be more simple by only adding new functions and not changing the StreamState at all.)

For the last case in the code snippet (`OtherError`), I'm not too sure what the conditions are -- when do we know what the stream state is (some sort of an error), but not know precisely how? In the case of `fseek`, we don't precisely know what the state is but we know how it came about. I just don't yet see why we need a generic error state.

> Probably I can add another, or multiple ones, but then again the too-big-patch problem comes up.

I think the point of the patch is to demonstrate the implementation of an error state, not to implement them all, and it does it quite well!

> For now, the EofError and OtherError can be removed, in this change these are not used (according to how fseek will be done).

I agree!
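
A compilable variant of the message-selection idea from the comment above, kept independent of the analyzer. The enum and `describeError` are hypothetical: `FseekError` does not exist in the diff, `FError` stands in for the snippet's `Errorf`, and returning a string replaces the `reportBug` call, whose real signature is not shown in this review.

```cpp
#include <string>

// Error kinds assumed for this sketch, following the comment's snippet.
enum class ErrorKind { NoError, FseekError, EofError, FError, OtherError };

// Hypothetical helper: choose the report text for a stream's error kind.
// An empty string means there is nothing to report.
std::string describeError(ErrorKind K) {
  switch (K) {
  case ErrorKind::NoError:
    return "";
  case ErrorKind::FseekError:
    return "After a failed fseek call, the state of the stream may "
           "have changed, and it might be feof() or ferror()!";
  case ErrorKind::EofError:
    return "The stream is in an feof() state!";
  case ErrorKind::FError:
    return "The stream is in an ferror() state!";
  case ErrorKind::OtherError:
    // The precise error is unknown, only that there is one.
    return "The stream is in an error state!";
  }
  return "";
}
```

Keeping the origin of the error (here `FseekError`) in the state is what allows the first, more specific message; a single generic error kind would only support the last one.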

To avoid problems I created a new version of this diff too that follows after the other new ones:
https://reviews.llvm.org/D75682
(Adding a new diff can make inline comment positions even more inexact specially if the diff has many differences from an older one?)

martong added inline comments. Mar 6 2020, 6:37 AM

clang/lib/StaticAnalyzer/Checkers/StreamChecker.cpp
91–124	> In StdLibraryFunctionsChecker's evalCall, the return value is explicitly constructed, and further constraints on it are only imposed in postCall. I wonder why that is. @martong, any idea why we don't apply the constraints for pure functions in evalCall?

We could apply them in evalCall technically. I think the reason why we don't do that is a matter of implementation, and more importantly this way we are consistent with the traditional Hoare logic: {Pre}C{Post}, as {Post} is done in postCall.

In D75356#1909193, @balazske wrote:

To avoid problems I created a new version of this diff too that follows after the other new ones:
https://reviews.llvm.org/D75682
(Adding a new diff can make inline comment positions even more inexact specially if the diff has many differences from an older one?)

I'd strongly prefer to finish this and move on after that -- we have the discussion here, and a great looking patch with only a few things to address.

Could you please fix up the dependencies of this revision?

I have "mirrored" all 3 changes in this stack to the new series in D75682. Probably it is possible to reuse these revisions instead but I do not know if it will not confuse phabricator somehow (and how phabricator behaves in such "tricky" cases, there is not a usable documentation for it). The D75682 is the one that should be used now, it is the same change as this one plus the ferror and feof functions and tests. The new part with ferror and feof can be done in a new change but these belong logically into this change to make the error handling complete and testable (this change in current form is not good for tests).

Szelethus added a comment. Mar 6 2020, 7:46 AM

This comment was removed by Szelethus.

In D75356#1909610, @balazske wrote:

The D75682 is the one that should be used now,

If this patch is supposed to be a followup to D75682, could you please mark it as such? I find these revisions difficult to navigate.

I have "mirrored" all 3 changes in this stack to the new series in D75682. Probably it is possible to reuse these revisions instead but I do not know if it will not confuse phabricator somehow (and how phabricator behaves in such "tricky" cases, there is not a usable documentation for it).

Since this is the patch where we held the discussion about error states, I think it would be better for this revision land first, that would also solve the problem of inlines being all over the place. It doesn't really matter whether we're introducing error states first through feof and ferror, or the admittedly quirky fseek. WDYT?

balazske abandoned this revision. Mar 9 2020, 8:00 AM

Szelethus mentioned this in D75851: [Analyzer][StreamChecker] Added evaluation of fseek. Mar 10 2020, 7:39 AM

Revision Contents

Path: clang/lib/StaticAnalyzer/Checkers/StreamChecker.cpp
Size: 178 lines

Diff 247824

clang/lib/StaticAnalyzer/Checkers/StreamChecker.cpp

using namespace clang;		using namespace clang;
using namespace ento;		using namespace ento;
using namespace std::placeholders;		using namespace std::placeholders;

namespace {		namespace {

struct StreamState {		struct StreamState {
enum Kind { Opened, Closed, OpenFailed, Escaped } K;		// State of a stream symbol.
		// In any non-opened state the stream really does not exist.
StreamState(Kind k) : K(k) {}		// The OpenFailed means a previous open has failed, the stream is not open.
		enum KindTy { Opened, Closed, OpenFailed } State;
bool isOpened() const { return K == Opened; }
bool isClosed() const { return K == Closed; }		// The error state of a stream.
bool isOpenFailed() const { return K == OpenFailed; }		// NoError: No error flag is set or stream is not open.
// bool isEscaped() const { return K == Escaped; }		// EofError: EOF condition (feof returns true)
		// OtherError: other (non-EOF) error (ferror returns true)
bool operator==(const StreamState &X) const { return K == X.K; }		// AnyError: EofError or OtherError
		enum ErrorKindTy {
static StreamState getOpened() { return StreamState(Opened); }		NoError,
static StreamState getClosed() { return StreamState(Closed); }		EofError,
static StreamState getOpenFailed() { return StreamState(OpenFailed); }		OtherError,
static StreamState getEscaped() { return StreamState(Escaped); }		AnyError
		} ErrorState = NoError;
void Profile(llvm::FoldingSetNodeID &ID) const { ID.AddInteger(K); }
		bool isOpened() const { return State == Opened; }
		bool isClosed() const { return State == Closed; }
		bool isOpenFailed() const { return State == OpenFailed; }

		bool operator==(const StreamState &X) const {
		return State == X.State && ErrorState == X.ErrorState;
		}

		static StreamState getOpened() { return StreamState{Opened}; }
		static StreamState getClosed() { return StreamState{Closed}; }
		static StreamState getOpenFailed() { return StreamState{OpenFailed}; }
		static StreamState getOpenedWithError() {
		return StreamState{Opened, AnyError};
		}

		void Profile(llvm::FoldingSetNodeID &ID) const {
		ID.AddInteger(State);
		ID.AddInteger(ErrorState);
		}
};		};

class StreamChecker;		class StreamChecker;
struct FnDescription;		struct FnDescription;
using FnCheck = std::function<void(const StreamChecker *, const FnDescription *,		using FnCheck = std::function<void(const StreamChecker *, const FnDescription *,
const CallEvent &, CheckerContext &)>;		const CallEvent &, CheckerContext &)>;

using ArgNoTy = unsigned int;		using ArgNoTy = unsigned int;
static const ArgNoTy ArgNone = std::numeric_limits<ArgNoTy>::max();		static const ArgNoTy ArgNone = std::numeric_limits<ArgNoTy>::max();

struct FnDescription {		struct FnDescription {
FnCheck PreFn;		FnCheck PreFn;
FnCheck EvalFn;		FnCheck EvalFn;
ArgNoTy StreamArgNo;		ArgNoTy StreamArgNo;
};		};

/// Get the value of the stream argument out of the passed call event.		/// Get the value of the stream argument out of the passed call event.
/// The call should contain a function that is described by Desc.		/// The call should contain a function that is described by Desc.
SVal getStreamArg(const FnDescription *Desc, const CallEvent &Call) {		SVal getStreamArg(const FnDescription *Desc, const CallEvent &Call) {
assert(Desc && Desc->StreamArgNo != ArgNone &&		assert(Desc && Desc->StreamArgNo != ArgNone &&
"Try to get a non-existing stream argument.");		"Try to get a non-existing stream argument.");
return Call.getArgSVal(Desc->StreamArgNo);		return Call.getArgSVal(Desc->StreamArgNo);
}		}

		/// Create a conjured symbol return value for a call expression.
		DefinedSVal makeRetVal(CheckerContext &C, const CallExpr *CE) {
		assert(CE && "Expecting a call expression.");

		const LocationContext *LCtx = C.getPredecessor()->getLocationContext();
		return C.getSValBuilder()
		.conjureSymbolVal(nullptr, CE, LCtx, C.blockCount())
		.castAs<DefinedSVal>();
		}

class StreamChecker		class StreamChecker
: public Checker<check::PreCall, eval::Call, check::DeadSymbols> {		: public Checker<check::PreCall, eval::Call, check::DeadSymbols> {
mutable std::unique_ptr<BuiltinBug> BT_nullfp, BT_illegalwhence,		mutable std::unique_ptr<BuiltinBug> BT_nullfp, BT_illegalwhence,
BT_UseAfterClose, BT_UseAfterOpenFailed, BT_ResourceLeak;		BT_UseAfterClose, BT_UseAfterOpenFailed, BT_ResourceLeak;

public:		public:
void checkPreCall(const CallEvent &Call, CheckerContext &C) const;		void checkPreCall(const CallEvent &Call, CheckerContext &C) const;
bool evalCall(const CallEvent &Call, CheckerContext &C) const;		bool evalCall(const CallEvent &Call, CheckerContext &C) const;
void checkDeadSymbols(SymbolReaper &SymReaper, CheckerContext &C) const;		void checkDeadSymbols(SymbolReaper &SymReaper, CheckerContext &C) const;

private:		private:
CallDescriptionMap<FnDescription> FnDescriptions = {		CallDescriptionMap<FnDescription> FnDescriptions = {
{{"fopen"}, {nullptr, &StreamChecker::evalFopen, ArgNone}},		{{"fopen"}, {nullptr, &StreamChecker::evalFopen, ArgNone}},
{{"freopen", 3},		{{"freopen", 3},
{&StreamChecker::preFreopen, &StreamChecker::evalFreopen, 2}},		{&StreamChecker::preFreopen, &StreamChecker::evalFreopen, 2}},
{{"tmpfile"}, {nullptr, &StreamChecker::evalFopen, ArgNone}},		{{"tmpfile"}, {nullptr, &StreamChecker::evalFopen, ArgNone}},
{{"fclose", 1},		{{"fclose", 1},
{&StreamChecker::preDefault, &StreamChecker::evalFclose, 0}},		{&StreamChecker::preDefault, &StreamChecker::evalFclose, 0}},
{{"fread", 4}, {&StreamChecker::preDefault, nullptr, 3}},		{{"fread", 4}, {&StreamChecker::preDefault, nullptr, 3}},
{{"fwrite", 4}, {&StreamChecker::preDefault, nullptr, 3}},		{{"fwrite", 4}, {&StreamChecker::preDefault, nullptr, 3}},
{{"fseek", 3}, {&StreamChecker::preFseek, nullptr, 0}},		{{"fseek", 3}, {&StreamChecker::preFseek, &StreamChecker::evalFseek, 0}},
{{"ftell", 1}, {&StreamChecker::preDefault, nullptr, 0}},		{{"ftell", 1}, {&StreamChecker::preDefault, nullptr, 0}},
{{"rewind", 1}, {&StreamChecker::preDefault, nullptr, 0}},		{{"rewind", 1}, {&StreamChecker::preDefault, nullptr, 0}},
{{"fgetpos", 2}, {&StreamChecker::preDefault, nullptr, 0}},		{{"fgetpos", 2}, {&StreamChecker::preDefault, nullptr, 0}},
		Szelethus: Do you have other patches that really need this class? Why isn't `CallEvent::getReturnValue` sufficient? This is a legitimate question, I really don't know. :)
		balazske (author): This is an "interesting" solution to the problem that a function with 3 return values is needed. The constructor performs the task of the function: it creates a conjured value (and gets the various objects needed for it). The output values are `RetVal` and `RetSym`, the success state, and the call expression, which is computed here anyway. The call expression could be computed independently, but once the value has been retrieved it is better to store it for later use. (I did not check how costly that operation is.) In my experience, using only `getReturnValue` and putting constraints on it does not work as intended. The reason may be that a value has to be bound to the call expression; otherwise it is an unknown (undefined?) value and not the conjured symbol.
		Szelethus: I suspect that `getReturnValue` might only work in `postCall`, but I'm not sure. I think instead of this class, a function returning a `std::tuple` would be nicer, with `std::tie` on the call site. You seem to use all 3 return values in the functions that instantiate `MakeRetVal` anyways :). In `StdLibraryFunctionsChecker`'s `evalCall`, the return value is explicitly constructed, and further constraints on it are only imposed in `postCall`. I wonder why that is. @martong, any idea why we don't `apply` the constraints for pure functions in `evalCall`?
		balazske (author): The return value case is not as simple because `DefinedSVal` has no default constructor, but it is not so bad to return only the `RetVal` and take `CE` as an argument.
		Szelethus: I like the current solution very much!
		martong:
		> In `StdLibraryFunctionsChecker`'s `evalCall`, the return value is explicitly constructed, and further constraints on it are only imposed in `postCall`. I wonder why that is. @martong, any idea why we don't apply the constraints for pure functions in `evalCall`?
		We could technically apply them in `evalCall`. I think the reason we don't is partly a matter of implementation and, more importantly, this way we stay consistent with traditional Hoare logic, {Pre}C{Post}, since {Post} is handled in `postCall`.
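The trade-off discussed in this thread (a tuple return versus a dedicated type, given a value type without a default constructor) can be sketched in isolation. This is only an illustration under stated assumptions: `Value`, `Results`, `makeResultsTuple` and `makeResults` are hypothetical names, and `Value` merely stands in for a type like `DefinedSVal`; none of this is code from the patch.

```cpp
#include <tuple>

// Hypothetical stand-in for a type that has no default constructor.
struct Value {
  explicit Value(int Id) : Id(Id) {}
  int Id;
};

// Option A: return a tuple. The caller cannot receive it with std::tie,
// because std::tie assigns into already-constructed variables and a
// `Value V;` declaration would not compile. std::get still works.
inline std::tuple<Value, int, bool> makeResultsTuple(int Seed) {
  return std::make_tuple(Value(Seed), Seed * 2, Seed > 0);
}

// Option B: return a small struct with named fields. No default
// construction is needed and the call site reads by name.
struct Results {
  Value Val;
  int Sym;
  bool Ok;
};

inline Results makeResults(int Seed) {
  return Results{Value(Seed), Seed * 2, Seed > 0};
}
```

A caller would read the tuple with `std::get<0>(T)` and the struct with `R.Val`. C++17 structured bindings (`auto [V, Sym, Ok] = ...`) would also avoid the default-construction problem where that language version is available.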
{{"fsetpos", 2}, {&StreamChecker::preDefault, nullptr, 0}},		{{"fsetpos", 2}, {&StreamChecker::preDefault, nullptr, 0}},
{{"clearerr", 1}, {&StreamChecker::preDefault, nullptr, 0}},		{{"clearerr", 1}, {&StreamChecker::preDefault, nullptr, 0}},
{{"feof", 1}, {&StreamChecker::preDefault, nullptr, 0}},		{{"feof", 1}, {&StreamChecker::preDefault, nullptr, 0}},
{{"ferror", 1}, {&StreamChecker::preDefault, nullptr, 0}},		{{"ferror", 1}, {&StreamChecker::preDefault, nullptr, 0}},
{{"fileno", 1}, {&StreamChecker::preDefault, nullptr, 0}},		{{"fileno", 1}, {&StreamChecker::preDefault, nullptr, 0}},
};		};

void preDefault(const FnDescription *Desc, const CallEvent &Call,		void preDefault(const FnDescription *Desc, const CallEvent &Call,
CheckerContext &C) const;		CheckerContext &C) const;
void preFseek(const FnDescription *Desc, const CallEvent &Call,
		void evalFopen(const FnDescription *Desc, const CallEvent &Call,
CheckerContext &C) const;		CheckerContext &C) const;

void preFreopen(const FnDescription *Desc, const CallEvent &Call,		void preFreopen(const FnDescription *Desc, const CallEvent &Call,
CheckerContext &C) const;		CheckerContext &C) const;
		void evalFreopen(const FnDescription *Desc, const CallEvent &Call,
		CheckerContext &C) const;

void evalFopen(const FnDescription *Desc, const CallEvent &Call,		void preFseek(const FnDescription *Desc, const CallEvent &Call,
CheckerContext &C) const;		CheckerContext &C) const;
void evalFreopen(const FnDescription *Desc, const CallEvent &Call,		void evalFseek(const FnDescription *Desc, const CallEvent &Call,
CheckerContext &C) const;		CheckerContext &C) const;

void evalFclose(const FnDescription *Desc, const CallEvent &Call,		void evalFclose(const FnDescription *Desc, const CallEvent &Call,
CheckerContext &C) const;		CheckerContext &C) const;

/// Check that the stream (in StreamVal) is not NULL.		/// Check that the stream (in StreamVal) is not NULL.
/// If it can only be NULL a fatal error is emitted and nullptr returned.		/// If it can only be NULL a fatal error is emitted and nullptr returned.
/// Otherwise a new state where the stream is constrained to be non-null.		/// Otherwise a new state where the stream is constrained to be non-null.
ProgramStateRef ensureStreamNonNull(SVal StreamVal, CheckerContext &C,		ProgramStateRef ensureStreamNonNull(SVal StreamVal, CheckerContext &C,
ProgramStateRef State) const;		ProgramStateRef State) const;
[29 lines not shown]	if (!FD || FD->getKind() != Decl::Function)
return nullptr;		return nullptr;

return FnDescriptions.lookup(Call);		return FnDescriptions.lookup(Call);
}		}
};		};

} // end anonymous namespace		} // end anonymous namespace

		// Store the state of the stream.
REGISTER_MAP_WITH_PROGRAMSTATE(StreamMap, SymbolRef, StreamState)		REGISTER_MAP_WITH_PROGRAMSTATE(StreamMap, SymbolRef, StreamState)

void StreamChecker::checkPreCall(const CallEvent &Call,		void StreamChecker::checkPreCall(const CallEvent &Call,
CheckerContext &C) const {		CheckerContext &C) const {
const FnDescription *Desc = lookupFn(Call);		const FnDescription *Desc = lookupFn(Call);
if (!Desc || !Desc->PreFn)		if (!Desc || !Desc->PreFn)
return;		return;

[19 lines not shown]	if (!State)
return;		return;
State = ensureStreamOpened(StreamVal, C, State);		State = ensureStreamOpened(StreamVal, C, State);
if (!State)		if (!State)
return;		return;

C.addTransition(State);		C.addTransition(State);
}		}

void StreamChecker::preFseek(const FnDescription *Desc, const CallEvent &Call,
CheckerContext &C) const {
ProgramStateRef State = C.getState();
SVal StreamVal = getStreamArg(Desc, Call);
State = ensureStreamNonNull(StreamVal, C, State);
if (!State)
return;
State = ensureStreamOpened(StreamVal, C, State);
if (!State)
return;
State = ensureFseekWhenceCorrect(Call.getArgSVal(2), C, State);
if (!State)
return;

C.addTransition(State);
}

void StreamChecker::preFreopen(const FnDescription *Desc, const CallEvent &Call,
CheckerContext &C) const {
// Do not allow NULL as passed stream pointer but allow a closed stream.
ProgramStateRef State = C.getState();
State = ensureStreamNonNull(getStreamArg(Desc, Call), C, State);
if (!State)
return;

C.addTransition(State);
}

void StreamChecker::evalFopen(const FnDescription *Desc, const CallEvent &Call,		void StreamChecker::evalFopen(const FnDescription *Desc, const CallEvent &Call,
CheckerContext &C) const {		CheckerContext &C) const {
ProgramStateRef State = C.getState();		ProgramStateRef State = C.getState();
SValBuilder &SVB = C.getSValBuilder();		const CallExpr *CE = dyn_cast_or_null<CallExpr>(Call.getOriginExpr());
const LocationContext *LCtx = C.getPredecessor()->getLocationContext();

auto *CE = dyn_cast_or_null<CallExpr>(Call.getOriginExpr());
if (!CE)		if (!CE)
return;		return;

DefinedSVal RetVal = SVB.conjureSymbolVal(nullptr, CE, LCtx, C.blockCount())		DefinedSVal RetVal = makeRetVal(C, CE);
.castAs<DefinedSVal>();
SymbolRef RetSym = RetVal.getAsSymbol();		SymbolRef RetSym = RetVal.getAsSymbol();
assert(RetSym && "RetVal must be a symbol here.");

State = State->BindExpr(CE, C.getLocationContext(), RetVal);		State = State->BindExpr(CE, C.getLocationContext(), RetVal);

// Bifurcate the state into two: one with a valid FILE* pointer, the other		// Bifurcate the state into two: one with a valid FILE* pointer, the other
// with a NULL.		// with a NULL.
ProgramStateRef StateNotNull, StateNull;		ProgramStateRef StateNotNull, StateNull;
std::tie(StateNotNull, StateNull) =		std::tie(StateNotNull, StateNull) =
C.getConstraintManager().assumeDual(State, RetVal);		C.getConstraintManager().assumeDual(State, RetVal);

StateNotNull = StateNotNull->set<StreamMap>(RetSym, StreamState::getOpened());		StateNotNull = StateNotNull->set<StreamMap>(RetSym, StreamState::getOpened());
StateNull = StateNull->set<StreamMap>(RetSym, StreamState::getOpenFailed());		StateNull = StateNull->set<StreamMap>(RetSym, StreamState::getOpenFailed());

C.addTransition(StateNotNull);		C.addTransition(StateNotNull);
C.addTransition(StateNull);		C.addTransition(StateNull);
}		}

		void StreamChecker::preFreopen(const FnDescription *Desc, const CallEvent &Call,
		CheckerContext &C) const {
		// Do not allow NULL as passed stream pointer but allow a closed stream.
		ProgramStateRef State = C.getState();
		State = ensureStreamNonNull(getStreamArg(Desc, Call), C, State);
		if (!State)
		return;

		C.addTransition(State);
		}

void StreamChecker::evalFreopen(const FnDescription *Desc,		void StreamChecker::evalFreopen(const FnDescription *Desc,
const CallEvent &Call,		const CallEvent &Call,
CheckerContext &C) const {		CheckerContext &C) const {
ProgramStateRef State = C.getState();		ProgramStateRef State = C.getState();

auto *CE = dyn_cast_or_null<CallExpr>(Call.getOriginExpr());		auto *CE = dyn_cast_or_null<CallExpr>(Call.getOriginExpr());
if (!CE)		if (!CE)
return;		return;
[24 lines not shown]	StateRetNotNull =
StateRetNotNull->set<StreamMap>(StreamSym, StreamState::getOpened());		StateRetNotNull->set<StreamMap>(StreamSym, StreamState::getOpened());
StateRetNull =		StateRetNull =
StateRetNull->set<StreamMap>(StreamSym, StreamState::getOpenFailed());		StateRetNull->set<StreamMap>(StreamSym, StreamState::getOpenFailed());

C.addTransition(StateRetNotNull);		C.addTransition(StateRetNotNull);
C.addTransition(StateRetNull);		C.addTransition(StateRetNull);
}		}

		void StreamChecker::preFseek(const FnDescription *Desc, const CallEvent &Call,
		CheckerContext &C) const {
		ProgramStateRef State = C.getState();
		SVal StreamVal = getStreamArg(Desc, Call);
		State = ensureStreamNonNull(StreamVal, C, State);
		if (!State)
		return;
		State = ensureStreamOpened(StreamVal, C, State);
		if (!State)
		return;
		State = ensureFseekWhenceCorrect(Call.getArgSVal(2), C, State);
		if (!State)
		return;

		C.addTransition(State);
		}

		void StreamChecker::evalFseek(const FnDescription *Desc, const CallEvent &Call,
		CheckerContext &C) const {
		ProgramStateRef State = C.getState();
		SymbolRef StreamSym = getStreamArg(Desc, Call).getAsSymbol();
		if (!StreamSym)
		return;

		const CallExpr *CE = dyn_cast_or_null<CallExpr>(Call.getOriginExpr());
		if (!CE)
		return;

		// Ignore the call if the stream is not tracked.
		if (!State->get<StreamMap>(StreamSym))
		return;
		Szelethus: If we check in `preCall` whether the stream is opened, why don't we conservatively assume it to be open?
		balazske (author): If we do that, we get a resource leak error, for example in the test function `pr8081` (there is only a call to `fseek`). We can assume that if the function gets the file pointer as an argument it does not "own" it, so the close is not needed. Otherwise false-positive resource leaks would always appear in functions that take a file argument and perform file operations. Or this case (the file pointer is passed as an input argument and the file is not opened by the function being analyzed) can be handled specially: we can assume that there is no error initially and that closing the file is not needed.
		balazske (author): This can be done in a later change. It involves more changes in other places. I am thinking of inserting the state for the stream if it was not there before. But we need to record whether this was such an insert or a normal `fopen` (and not report a resource leak for the "insert" case).
		Szelethus:
		> This can be done in a later change.
		Consider me convinced :)
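The follow-up idea from this thread (start tracking a stream on first use, but remember whether the analyzed code actually opened it) can be modeled outside the analyzer. This is a minimal sketch under assumptions: `StreamId`, `StreamInfo`, `trackOnUse`, `trackOnFopen` and `shouldReportLeak` are hypothetical names, and a plain `std::map` stands in for the checker's program-state map.

```cpp
#include <map>

enum class Kind { Opened, Closed, OpenFailed };

struct StreamInfo {
  Kind K;
  bool OpenedHere; // true only if the analyzed code called fopen itself
};

using StreamId = int; // hypothetical stand-in for a symbol
using StreamTable = std::map<StreamId, StreamInfo>;

// Called when the analyzed code opens the stream.
inline void trackOnFopen(StreamTable &T, StreamId S) {
  T[S] = StreamInfo{Kind::Opened, /*OpenedHere=*/true};
}

// Called when a stream function such as fseek sees the stream.
// An untracked stream is inserted as opened, but not as owned.
inline void trackOnUse(StreamTable &T, StreamId S) {
  if (T.find(S) == T.end())
    T.emplace(S, StreamInfo{Kind::Opened, /*OpenedHere=*/false});
}

// A leak is reported only for streams that are still open and were
// opened by the analyzed code.
inline bool shouldReportLeak(const StreamTable &T, StreamId S) {
  auto It = T.find(S);
  return It != T.end() && It->second.K == Kind::Opened &&
         It->second.OpenedHere;
}
```

With this split, a function that only calls `fseek` on a stream it received as an argument gets a tracked stream (so error state can be stored) without a resource leak report at the end of the function.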

		DefinedSVal RetVal = makeRetVal(C, CE);

		// Make expression result.
		State = State->BindExpr(CE, C.getLocationContext(), RetVal);

		// Bifurcate the state into failed and non-failed.
		// Return zero on success, nonzero on error.
		ProgramStateRef StateNotFailed, StateFailed;
		std::tie(StateFailed, StateNotFailed) =
		C.getConstraintManager().assumeDual(State, RetVal);

		Szelethus: According to the C99 standard, §7.19.9.2, paragraph 5 (http://www.open-std.org/jtc1/sc22/wg14/www/docs/n1256.pdf):
		> After determining the new position, a successful call to the fseek function undoes any effects of the `ungetc` function on the stream, clears the end-of-file indicator for the stream, and then establishes the new position. After a successful fseek call, the next operation on an update stream may be either input or output.
		So it definitely doesn't clear the `EOF` flag on failure.
		balazske (author): Yes, it says nothing about what happens to the EOF flag on failure, so it is better not to change it. And we do not know whether it is possible to get an EOF error (seek to after the end of the file?).
		// Reset the state to opened with no error.
		StateNotFailed =
		StateNotFailed->set<StreamMap>(StreamSym, StreamState::getOpened());
		// Set state to any error.
		// Assume that EOF and other error is possible after the call.
		StateFailed =
		StateFailed->set<StreamMap>(StreamSym, StreamState::getOpenedWithError());
		Szelethus: But why? The standard suggests that the error state isn't changed upon failure. I think we should leave it as-is.
		balazske (author): `fseek` can fail and set the error flag; see the example code here: https://en.cppreference.com/w/c/io/fseek. It probably cannot set the EOF flag; according to that page it is possible to seek past the end of the file. So the function can succeed or fail with `OtherError`.
		Szelethus: Yup, right, you won :) I tried some examples out on my system, and it could preserve or change the error state of the stream. It seems to me that not checking the state of the stream after a failed `fseek` is definitely ill-advised. This still should be `AnyError` (or `FseekError`), since according to `OtherError`'s description `OtherError` may not refer to `EOF`, yet after a failed `fseek` call the stream can be in `EOF`:
		    $ cat test.cpp
		    #include <cstdio>
		    int main() {
		      FILE *F = fopen("test.cpp", "r");
		      while (EOF != fgetc(F)) {}
		      if (feof(F))
		        printf("The file is closed!\n");
		      // Return value is non-zero on failure.
		      if (fseek(F, -100000, SEEK_END)) {
		        if (feof(F))
		          printf("The file is still closed!\n");
		        else
		          printf("The file is no longer closed!\n");
		      }
		    }
		    $ clang test.cpp && ./a.out
		    The file is closed!
		    The file is still closed!
		balazske (author): After some experimenting I think it is best to allow every possible error state after `fseek`. This means either EOF, or other-error, or neither EOF nor other-error (but a failed `fseek` call according to the return value). So we need 1 success and 2 error return value states: one error return with `AnyError` and one with `NoError` (strange, but it happens according to the output shown). There is probably a relation between the previous state and the result produced by `fseek`, but I do not want to figure it out; it may differ on other systems and the documentation says nothing. This comes from the following program:
		    #include <stdio.h>
		    void print_result(FILE *F, int rc, const char *errtxt) {
		      printf("--------\n%s", errtxt);
		      if (rc) {
		        printf("failed...\n");
		        if (feof(F)) {
		          printf("FEOF\n");
		        }
		        if (ferror(F)) {
		          printf("FERROR\n");
		          perror("error");
		        }
		      } else {
		        printf("success\n");
		      }
		    }
		    int main() {
		      FILE *F = fopen("fseek.c", "r");
		      while (EOF != fgetc(F)) {}
		      print_result(F, 1, "read done\n");
		      // Return value is non-zero on failure.
		      int rc = fseek(F, -100000, SEEK_END);
		      print_result(F, rc, "seek invalid\n");
		      rc = fseek(F, 2, SEEK_END);
		      print_result(F, rc, "seek valid\n");
		      rc = fseek(F, -100000, SEEK_END);
		      print_result(F, rc, "seek invalid\n");
		      fputs("str", F);
		      print_result(F, 1, "failed operation\n");
		      rc = fseek(F, -100000, SEEK_END);
		      print_result(F, rc, "seek invalid\n");
		      rc = fseek(F, -1, SEEK_END);
		      print_result(F, rc, "seek valid\n");
		      rc = fseek(F, -100000, SEEK_END);
		      print_result(F, rc, "seek invalid\n");
		    }
		And the result is:
		    --------
		    read done
		    failed...
		    FEOF
		    --------
		    seek invalid
		    failed...
		    FEOF
		    --------
		    seek valid
		    success
		    --------
		    seek invalid
		    failed...
		    --------
		    failed operation
		    failed...
		    FERROR
		    error: Bad file descriptor
		    --------
		    seek invalid
		    failed...
		    FERROR
		    error: Invalid argument
		    --------
		    seek valid
		    success
		    --------
		    seek invalid
		    failed...
		    FERROR
		    error: Invalid argument
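The conclusion of this thread (one success state and two failure states after `fseek`) can be written down as a small table. This is a sketch of the proposal as described in the comment above, not code from the patch: `Outcome` and `fseekOutcomes` are hypothetical names, and `ErrorKind` only mirrors the `NoError` and `AnyError` kinds mentioned in the discussion.

```cpp
#include <vector>

// Error kinds as named in the discussion; AnyError means "EOF or some
// other error may be set".
enum class ErrorKind { NoError, AnyError };

struct Outcome {
  bool CallFailed; // fseek returned a nonzero value
  ErrorKind Error; // stream error state assumed after the call
};

// Successor states to model after a call to fseek, following the comment
// above: success with no error, failure with any error, and failure with
// no error flag set on the stream.
inline std::vector<Outcome> fseekOutcomes() {
  return std::vector<Outcome>{
      Outcome{false, ErrorKind::NoError},
      Outcome{true, ErrorKind::AnyError},
      Outcome{true, ErrorKind::NoError},
  };
}
```

In the checker, each entry would presumably correspond to one transition added for the call, with the return value constrained to zero or nonzero according to `CallFailed`.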

		C.addTransition(StateNotFailed);
		C.addTransition(StateFailed);
		}

void StreamChecker::evalFclose(const FnDescription *Desc, const CallEvent &Call,		void StreamChecker::evalFclose(const FnDescription *Desc, const CallEvent &Call,
CheckerContext &C) const {		CheckerContext &C) const {
ProgramStateRef State = C.getState();		ProgramStateRef State = C.getState();
SymbolRef Sym = Call.getArgSVal(Desc->StreamArgNo).getAsSymbol();		SymbolRef Sym = Call.getArgSVal(Desc->StreamArgNo).getAsSymbol();
if (!Sym)		if (!Sym)
return;		return;

const StreamState *SS = State->get<StreamMap>(Sym);		const StreamState *SS = State->get<StreamMap>(Sym);
[62 lines not shown]
ProgramStateRef StreamChecker::ensureStreamOpened(SVal StreamVal,		ProgramStateRef StreamChecker::ensureStreamOpened(SVal StreamVal,
CheckerContext &C,		CheckerContext &C,
ProgramStateRef State) const {		ProgramStateRef State) const {
SymbolRef Sym = StreamVal.getAsSymbol();		SymbolRef Sym = StreamVal.getAsSymbol();
if (!Sym)		if (!Sym)
return State;		return State;

const StreamState *SS = State->get<StreamMap>(Sym);		const StreamState *SS = State->get<StreamMap>(Sym);
if (!SS)		if (!SS)
return State;		return State;
		Szelethus: Right here, should we just assume a stream to be opened when we can't prove otherwise? `ensureStreamOpened` is only called when we are about to evaluate a function that assumes the stream to be opened anyways, so I don't expect many false-positive missing-`fclose` reports.
		balazske (author): The warning is created only if we know that the stream is not opened. This function does not distinguish whether the stream is "tracked" (found in `StreamMap`) or not; the untracked case is treated the same as if the stream were opened. Probably the function could be changed to take a `StreamState` instead of `StreamVal`, with the untracked case handled separately. Alternatively, this function could add a new `StreamState` (in the opened state).

if (SS->isClosed()) {		if (SS->isClosed()) {
// Using a stream pointer after 'fclose' causes undefined behavior		// Using a stream pointer after 'fclose' causes undefined behavior
// according to cppreference.com .		// according to cppreference.com .
ExplodedNode *N = C.generateErrorNode();		ExplodedNode *N = C.generateErrorNode();
if (N) {		if (N) {
if (!BT_UseAfterClose)		if (!BT_UseAfterClose)
BT_UseAfterClose.reset(new BuiltinBug(this, "Closed stream",		BT_UseAfterClose.reset(new BuiltinBug(this, "Closed stream",
[63 lines not shown]

Revision Contents

Diff 247824

clang/lib/StaticAnalyzer/Checkers/StreamChecker.cpp
