Page MenuHomePhabricator

[clang][analyzer] Add standard streams to alpha.unix.Stream checker.

Authored by balazske on Jul 23 2021, 3:18 AM.



Recognize initial value of standard streams (stdin, ...)
and assume that a new opened stream is not equal to these.

Diff Detail

Event Timeline

balazske created this revision.Jul 23 2021, 3:18 AM
balazske requested review of this revision.Jul 23 2021, 3:18 AM
Herald added a project: Restricted Project. · View Herald TranscriptJul 23 2021, 3:18 AM
Herald added a subscriber: cfe-commits. · View Herald Transcript
martong added inline comments.Jul 23 2021, 6:05 AM

Would it be possible to reuse (i.e put in a common place) the StdLibraryFunctionChecker's lookupTy? That also handles the cases when FILEType is null or when we have a typedef struct FILE FILE;.

martong added inline comments.Jul 29 2021, 1:48 AM

Could you please consider the above question?

I like the idea, though I wonder whether evalAssume would be a better callback for this. That way, you'd only need to add an assumption when you reach a condition where one of the operands is standard.

Though it may be more trouble than its worth. I don't have strong feelings on this.


How about getStdStreamSymbol?

balazske updated this revision to Diff 362775.Jul 29 2021, 7:41 AM

Split some type lookup functions from StdLibraryFunctionsChecker into separate files.

Split some type lookup functions from StdLibraryFunctionsChecker into separate files.

Thank you, this is awesome!


We should be careful, to cache the results (either here, or deeper in the call stack).
I mean, we certainly don't want to do a lookup of "stdin" every time a function is evaluated. We should do this lazily, only once.

steakhal added inline comments.Aug 5 2021, 9:03 AM

Decl *D -> const Decl *D. Same for the other loop.


You calculate this 3 times now. I mean it's not a big deal, but we could save this to do it only once.


It will early return and uses one fewer continue.


I agree. We should do this only for the first top-level function, instead of doing this for every top-level function.


It might be a personal preference but I think lambdas should be pure in the sense that it takes something and produces something else.
Regardless of your choice, the trailing return type would highlight it.


SValBuilder::getConditionType(), oh they are the same under the hood. We should still probably prefer this one instead.
It might worth hoisting C.getSValBuilder() to a local variable though.
The symbol for StdStream also deserves a separate variable IMO.


I think you should be fine with castAs(). I'm not expecting this to fail.


How about checking this way instead?

clang_analyzer_eval(F != stdin);  // TRUE
clang_analyzer_eval(F != stdout); // TRUE
clang_analyzer_eval(F != stderr); // TRUE
balazske added inline comments.Aug 6 2021, 4:02 AM

The intent was that if no FILEType is found we find just the first VarDecl and do not check the type. This recommended change returns always null if no FILEType is found.


I am not sure if the SymbolRef values are safe to store between top-level function analyses. The FILE type and std stream declarations are safe to cache, but the SymbolRef values of these variables probably not.

balazske updated this revision to Diff 364787.Aug 6 2021, 7:07 AM

Init std declarations and FILE type only once.
Make a lambda not operate on captured data.
Simplify test.

balazske marked 8 inline comments as done.Aug 6 2021, 7:08 AM

Let's see if we can cache the stream symbols across top-level analyses.


Hm, it was not immediately clear to me.


I think it should be safe. But place there an assert and run the test suite. If it won't trigger, that means that we have high confidentiality that this is safe. I know it's not 100%, but pretty close.


This should be probably at the definition of StdinSym and getStdStreamSym().

balazske marked an inline comment as done.Aug 6 2021, 8:43 AM
balazske added inline comments.

I tried to check if these SymbolRef's are the same at checkBeginFunction after initialized once. The assertion for equality failed sometimes, or other crash happened when dump was called on the value. So it looks not safe to cache these. And should we add assumptions about that these StdinSym, StdoutSym, StderrSym are not equal to each other?


This lambda is possible to improve.

steakhal added inline comments.Aug 6 2021, 8:45 AM

Good to know. I don't think it's necessary to add assertions.

balazske updated this revision to Diff 365175.Aug 9 2021, 6:27 AM

Simplified a lambda usage.
Moved some functions, changed documentation.

steakhal accepted this revision.Aug 9 2021, 8:41 AM

It's fine on my part.

This revision is now accepted and ready to land.Aug 9 2021, 8:41 AM
martong added inline comments.Mon, Aug 30, 7:03 AM

Okay, so we should not cache the SymbolRefs then. But we must cache the VarDecls. I mean, we should avoid calling

StdinDecl = findStdStreamDecl("stdin", C);

more than once for each TU.

I think you could use and Optional for StdinDecl (and the others) to achieve this.