This checker can be good enough to move out of alpha.
I am not sure about the exact requirements, this review can be a place
for discussion about what should be fixed (if any).
Details
Diff Detail
- Repository
- rG LLVM Github Monorepo
Event Timeline
I could test the checker on these projects (CTU analysis was not used):
memcached,tmux,curl,twin,vim,openssl,sqlite,ffmpeg,postgres,tinyxml2,libwebm,xerces,bitcoin,protobuf,qtbase,contour,acid
These are reports that could be improved:
link
In this case function fileno returns -1 because of failure, but this is not indicated in a NoteTag. This is a correct result, only the note is missing. This problem can be solved if a note is displayed on every branch ("case") of the standard C functions. But this leads to many notes at un-interesting places. If the note is displayed only at "interesting" values another difficulty shows up: The note disappears from places where it should be shown because the "interestingness" is not set, for example at conditions of if statement. So the solution may require more work. This case with function fileno occurs 13 times in all the tested projects.
link
The function open is not modeled in StdCLibraryFunctionsChecker, it should not return less than -1 but this information is not included now.
link
This looks wrong, L should not be 0 because len looks > 0 (see the macros that set len). Probably the many bitwise operations cause the problem.
link
socket can not return less than -1 but this function is not modeled currently.
link
fwrite with 0 buffer and 0 size should not be an error, this is not checked now.
link
When file_size is 0 status.ok() is probably false that is not correctly recognized (may work in CTU mode?).
These results look good:
link
link
link
link
link
link
link
link
In this last case it looks like that previous call to ftell returns -1, this value is assigned to fileSize. This is again a case for improvement similar as the case with fileno.
One deficiency is that some filenames of test files contain the old std-c-library-functions-arg name that is not used any more.
I am not sure about the exact requirements, this review can be a place for discussion about what should be fixed (if any).
D52984 added the "Making your checker better" section to the dev manual: https://clang-analyzer.llvm.org/checker_dev_manual.html (nobody can be faulted for not finding this, aside from those that witnessed its creation, it has faded from the collective memory of the analyzer developers).
Yeah, this is a tough cookie... is it okay to find hide the -1 branch behind an off-by-default checker option for the time being?
link
The function open is not modeled in StdCLibraryFunctionsChecker, it should not return less than -1 but this information is not included now.
link
socket can not return less than -1 but this function is not modeled currently.
These should be a rather painless fix, right?
link
This looks wrong, L should not be 0 because len looks > 0 (see the macros that set len). Probably the many bitwise operations cause the problem.
link
When file_size is 0 status.ok() is probably false that is not correctly recognized (may work in CTU mode?).
Looks like something we can live with.
link
fwrite with 0 buffer and 0 size should not be an error, this is not checked now.
Some discussion for that: D140387#inline-1360054. There is a FIXME in the code for it -- not sure how common this specific issue is, but we did stumble on it in an open source project... how painful would it be to fix this?
These results look good:
link
link
link
link
link
link
link
link
In this last case it looks like that previous call to ftell returns -1, this value is assigned to fileSize. This is again a case for improvement similar as the case with fileno.
Could you elaborate on what do you mean by "The note disappears from places where it should be shown because the "interestingness" is not set, for example at conditions of if statement.". A short example would do the job I think.
I looked at the TPs, and if the violation was introduced by an assumption (instead of an assignment), then it's really hard to spot which assumption is important for the bug.
I wonder if we could add the TrackConstraintBRVisitor to the bugreport to "highlight" that particular assumption/place.
The question is first if this problem must be fixed before the checker comes out of alpha state. If yes I try to make another patch with this fix. I tried this previously but do not remember exactly what the problem was.
clang/docs/analyzer/checkers.rst | ||
---|---|---|
922 | This is applicable to C++ too? |
WIthout an explicit note message there, I don't see how could we advertise this as a "mature" checker.
clang/docs/analyzer/checkers.rst | ||
---|---|---|
922 | Yes it's applicable to c++ if they use these C APIs. |
Is it possible to hide functions hindered by this problem behing an off-by-default flag?
It is possible to add note tags to show decisions at standard functions. For example at fileno show if it has failed or not failed. The most simple way is to add it to all places, this means a note will show up on any bug path at all standard function usages. This is how it works already with the existing notes. Like in the following code:
int __test_case_note(); int test_case_note_1(int y) { int x1 = __test_case_note(); // expected-note{{Function returns 1}} int x = __test_case_note(); // expected-note{{Function returns 0}} \ // expected-note{{'x' initialized here}} return y / x; // expected-warning{{Division by zero}} \ // expected-note{{Division by zero}} } int test_case_note_2(int y) { int x = __test_case_note(); // expected-note{{Function returns 1}} return y / (x - 1); // expected-warning{{Division by zero}} \ // expected-note{{Division by zero}} }
Here the first note at line with "x1" is not necessary. This problem can be fixed if the note is only shown when the return value is "interesting":
int __test_case_note(); int test_case_note_1(int y) { int x1 = __test_case_note(); // no note int x = __test_case_note(); // expected-note{{Function returns 0}} \ // expected-note{{'x' initialized here}} return y / x; // expected-warning{{Division by zero}} \ // expected-note{{Division by zero}} } int test_case_note_2(int y) { int x = __test_case_note(); // no note return y / (x - 1); // expected-warning{{Division by zero}} \ // expected-note{{Division by zero}} }
But in this case the note at test_case_note_2 disappears because x-1 is interesting, but not x. Fixing this problem looks more difficult.
From these two solutions, which one is better? (Show many unnecessary notes, or show only necessary ones but lose some of the useful notes too.)
How likely are the problems with the 2nd case? I know its a hassle, but can you upload results for the first and the second case? Its hard to tell without seeing how these pan out in the real world.
I think we must avoid polluting the environment with unnecessary notes (especially if we'd emit "many" of them), so I strongly suggest option 2.
Later it would be possible to fix the limitations of the interestingness system, e.g. by introducing a "mark all symbols in this expression as interesting" function. Perhaps we should open a separate discussion about this on discourse -- but this review and the de-alpha-ing of StdCLibraryFunctions should not be delayed by this tangentially related engine improvement!
Personally I think it's completely acceptable if the analyzer sometimes emits bug reports that are true positives but lack a message like "expected-note{{Function returns 1}}" as I'd guess that in the majority of these cases it wouldn't be terribly difficult for the user to manually derive this information from the context. (I'd say that even the previously highlighted fileno report is acceptable -- it's surprising at first, but all bug reports come with an implicit "On a certain execution path..." prefix and in this case it's easy to see that "certain" = "when fileno returns -1".)
If these issues produce lots of issues that are very confusing, then we should put the affected functions behind off-by-default flags, finish this review process, and revisit them later when the interestingness system is improved; but based on the available information I don't think that's necessary.
I resign as a reviewer as I'm not deeply connected to this checker, thus I won't block it or accept it.
However, my opinion is that a checker should be "released" if they have clear diagnostics (which includes that it doesn't flood the user with unimportant diagnostics either).
Consequently, if there are features missing to accomplish that, then that thing is a blocker.
TBH I never understood why interestingness is not transitive over the SymExpr dependencies (symbols_begin/end).
This was not the only case when it hindered us. Just think of how taint propagates.
I must admit that I'm in the other camp.
If these issues produce lots of issues that are very confusing, then we should put the affected functions behind off-by-default flags, finish this review process, and revisit them later when the interestingness system is improved; but based on the available information I don't think that's necessary.
I can agree with this pragmatic approach.
To clarify my position the "sometimes" in my comment should've been "rarely" or something closer to that. I don't want to release confusing garbage, I'm just trying to optimize sum(helpfulness of checkers) and I feel that the last few steps towards perfect helpfulness are often disproportionately expensive. If a code contains two bugs, then two rough but understandable warnings are more valuable than one well-polished friendly warning + one completely missed bug (because we didn't have time to write a checker for it).
Do I understand it correctly that the "create note that's shown when the return value is marked as interesting" (the option 2 that I suggested) adds / will add clarifying notes to the fileno/fstat error reports and the ftell issue? Did you see any real-life examples where this option 2 does not provide useful notes? If not, then I think we can accept that option 2 is a sufficient solution for these issues.
In addition to this, there are these issues noted by @Szelethus:
If it's not too difficult, then please upload a new version of this patch that implements "option 2" (i.e. produces notes when the return value of the std library function is marked as interesting) + handles the three requests of @Szelethus. I hope that you can do this and then this review can be concluded quickly.
If there are significant obstacles (or I misunderstood the situation), then please reply and discuss the next steps.
Uh-oh, looks like I'm not paying nearly enough attention to this discussion (sorry about that!!)
I'm somewhat skeptical of the decision made in D151225 because the entire reason I originally implemented StdCLibraryFunctions was to deal with false positives I was seeing. It was really valuable even without the bug-finding part. So I really wish we could find some way to keep bug-finding and modeling separate.
I haven't read the entire discussion though, I need to catch up 😓
In this case function fileno returns -1 because of failure, but this is not indicated in a NoteTag. This is a correct result, only the note is missing. This problem can be solved if a note is displayed on every branch ("case") of the standard C functions. But this leads to many notes at un-interesting places. If the note is displayed only at "interesting" values another difficulty shows up: The note disappears from places where it should be shown because the "interestingness" is not set, for example at conditions of if statement. So the solution may require more work. This case with function fileno occurs 13 times in all the tested projects.
Extra notes along the path are fine as long as their isPrunable() flag is correctly set. It's perfectly ok to say "(15) Hey btw we've just ran over this statement and here's what we assume it did" in a section of a path that's already displayed to the user. Even if you say that on every line, it's probably ok.
But if the note causes a new nested stack frame to be displayed (which was otherwise pruned from the report), there better be a good reason for this, because this can easily increase the complexity of the bug report by a factor of 100.
It's definitely a requirement for a non-alpha checker to make sure that there are enough non-prunable pieces in the report for the user to understand the report. A lot of our existing on-by-default checkers (and even some core checkers) don't really hold up to this expectation (looking at you null dereference), but almost every time they don't, that's a false positive. If you need to pass -analyzer-config prune-paths=false in order to see a key step in the bug report, that's a false positive even if your path simulation was perfect.
So if any of these notes are essential to understanding any bug report (by this checker or by another checker, eg. like getenv() returning null is essential for null dereference checker), there needs to be a way for the note tag to learn that (eg., the getenv() should be able to ask whether the return value is being tracked by trackExpressionValue()) and if so, the note has to be marked as unprunable.
For first experiment I have made patch D153612 that adds a NoteTag to "all" standard function calls.
The problem was that modeling and report generation could not be separated correctly. Both are implemented in one class but are differently named checkers that should run in a specific order because dependency issues, this was not good. Other problem was that if the modeling checker runs first, it will apply state changes for pre and post conditions without generating a bug report even if a bug could be found in the previous state. The old state is then lost and other checkers will not find that bug. For example a case of null pointer argument to a function is always removed by the modeling part of the checker, even if this was a case when a bug report should be generated.
In my view, it would certainly be possible through enormous efforts to further granularize this checker (or these large ones in general), so that the modeling and reporting portions would could be cleanly separated into their own checker objects. That certianly was my belief a couple years back -- I sank months and months into MallocChecker, yet I'm still not even close to that goal.
So, with the modeling and the reporting being the same entity, we can't express that some more specific checkers should run before it. StreamChecker can construct more specific messages thatn StdLibraryFunctions for a null stream object, but only if it runs ahead of it. That implies a both a weak and a strong dependency on what is essentially the same checker. As things stand, not sure how we could have avoided this if we want these checkers to finally leave the alpha state.
Could you post the results for it as you have them please?
Why is this checker placed in the "unix" group (instead of e.g. "core")? As far as I understand it can check some POSIX-specific functions with a flag (that's off by default) but the core functionality checks platform-independent standard library functions.
Moreover, if you can, please provide links to some test result examples on open source projects (either run some fresh tests or provide links to an earlier test run that was running with the same functionality).
This checker was originally in the "unix" package. I agree that this is not exact and core can be better, the checked functions should be available in any default C library on UNIX, OSX, Windows or other platforms too, even the POSIX ones at least in some cases. This applies even to the other related checkers alpha.unix.Stream and alpha.unix.Errno.
I have checked the results on some projects (memcached,tmux,curl,twin,vim,openssl,sqlite,ffmpeg,postgres,xerces,bitcoin).
These results are more interesting, some look correct, some probably not:
https://codechecker-demo.eastus.cloudapp.azure.com/Default/report-detail?run=curl_curl-7_66_0_stdclibraryfunctions_alpha&is-unique=on&diff-type=New&checker-name=unix.StdCLibraryFunctions&report-id=2243964&report-hash=d4a4bda38c5a6fdaabe2c1867158b106&report-filepath=%2atftpd.c
https://codechecker-demo.eastus.cloudapp.azure.com/Default/report-detail?run=ffmpeg_n4.3.1_stdclibraryfunctions_alpha&is-unique=on&diff-type=New&checker-name=unix.StdCLibraryFunctions&report-hash=908f965d980d60292af95db0fa10cd5f&report-id=2252082&report-filepath=%2av4l2_buffers.c
https://codechecker-demo.eastus.cloudapp.azure.com/Default/report-detail?run=postgres_REL_13_0_stdclibraryfunctions_alpha&is-unique=on&diff-type=New&checker-name=unix.StdCLibraryFunctions&report-hash=914e79646cb0de40dab434ba24c8c23c&report-id=2259781&report-filepath=%2adsm_impl.c
https://codechecker-demo.eastus.cloudapp.azure.com/Default/report-detail?run=postgres_REL_13_0_stdclibraryfunctions_alpha&is-unique=on&diff-type=New&checker-name=unix.StdCLibraryFunctions&report-hash=58d8278be40f99597b44323d2574c053&report-id=2259789&report-filepath=%2asyslogger.c
https://codechecker-demo.eastus.cloudapp.azure.com/Default/report-detail?run=postgres_REL_13_0_stdclibraryfunctions_alpha&is-unique=on&diff-type=New&checker-name=unix.StdCLibraryFunctions&report-hash=1928ba718d9742340937d425ec3978c6&report-id=2260011&report-filepath=%2apg_backup_custom.c
https://codechecker-demo.eastus.cloudapp.azure.com/Default/report-detail?run=bitcoin_v0.20.1_stdclibraryfunctions_alpha&is-unique=on&diff-type=New&checker-name=unix.StdCLibraryFunctions&report-hash=6ad3a20f18f2850293b4cdd867e404e2&report-id=2266103&report-filepath=%2aenv_posix.cc
Correct but interesting, the note about failure of ftell is shown:
https://codechecker-demo.eastus.cloudapp.azure.com/Default/report-detail?run=xerces_v3.2.3_stdclibraryfunctions_alpha&is-unique=on&diff-type=New&checker-name=unix.StdCLibraryFunctions&report-hash=4ab640064066880ac7031727869c92f4&report-id=2260149&report-filepath=%2aThreadTest.cpp
I did not find results that are obvious false positive.
Many results are the case when fileno returns -1 and this value is used without check. The checker generates a note about failure of fileno. For example at these results:
https://codechecker-demo.eastus.cloudapp.azure.com/Default/reports?run=postgres_REL_13_0_stdclibraryfunctions_alpha&is-unique=on&diff-type=New&checker-name=unix.StdCLibraryFunctions
(There are cases when fileno(stderr) is assumed to fail. This case can be eliminated if the StreamChecker is enabled, after an improvement of the checker. But for this StreamChecker must run before StdCLibraryFunctionsChecker?)
Thanks for the sample reports.
I'm fine if we want to make it a non-alpha (released) checker.
An orthogonal question is, whether we want to have it under the code package.
I'm not sure there are official guidance for elevating a checker to code, but here are my assumptions.
To me, these are the questions to see how valuable our reports are for the user.
- How many issues does it raise? Would we flood the user?
- How "interesting" those issues are? Do they have *actual* value for the user? (Not only niece edge-cases, that is fancy to know about, but actual users would genuinely commit such mistakes)
- How long those bug-paths are in practice? I'd argue, the longer they are, usually the less actionable they are for the user. Less actionable reports are also less valuable, or even harmful.
- In general, how understandable these reports are? Do we have all the interesting "notes" or "events" on the path?
Please, reflect on these questions and argue why we should have diagnostics of this kind?
Note that, I'm (probably) fine enabling modeling parts by default,but having diagnostics by default is another thing.
Additionally, it might make sense to first "release" the checker, and after another llvm release, turn this into a Core checker.
It doesn't have to be in core just because it's dealing with "core features" of the language.
The core package is for checks without which path-sensitive analysis becomes so incredibly incorrect that we don't want to support such configuration, we don't want our users to ever use it.
(Ideally core checkers shouldn't emit any warnings at all. It has to be possible to disable every individual warning and still have other warnings work correctly. But unfortunately that's not the reality of the situation.)
(I agree unix doesn't sound right though. But we already have MallocChecker in unix, which is arguably way worse. What we really need is, to replace our package system with a hashtag system.)
About the questions:
- How many issues does it raise? Would we flood the user?
I did not experience that the checker produces many warnings. Any warning from this checker is connected to a function call of a standard API, and the number of such calls is usually not high. Typically one
problem which the checker reports can occur often in a specific program, for example the fileno case (fileno returns -1 at failure, often this failure is not handled and value -1 is used as a file number).
This should not be a case of hundreds of warnings.
- How "interesting" those issues are? Do they have *actual* value for the user? (Not only niece edge-cases, that is fancy to know about, but actual users would genuinely commit such mistakes)
If the coder cares about all edge-cases of API calls, these are real and important issues. More often most of the results are just cases of ignored errors that are very rare, the programmer probably intentionally did not handle these because it is not worth for a such rare situation. From security point of view these cases can be used to find places where it is possible to make an API call (which normally "never" fails) intentionally fail and produce unexpected behavior of the program. So for an average application many results are not very important, for stability and security critical code the results can be more important.
- How long those bug-paths are in practice? I'd argue, the longer they are, usually the less actionable they are for the user. Less actionable reports are also less valuable, or even harmful.
The bug path can be long, often only the very last part is important, but sometimes not.
- In general, how understandable these reports are? Do we have all the interesting "notes" or "events" on the path?
These should be not more difficult to understand than a division by zero, only with a function call instead of division.
As I don't use this checker, thus I cannot speak of my experience.
However, the reasons look solid and I'm fine with moving this checker to unix.StdCLibraryFunctions.
Let some time for other reviewers to object before landing this. Lets say, one week from now.
@xazax.hun @NoQ?
As next steps, I'd be glad set the default value of ModelPOSIX to true. I don't see much harm doing so.
(And maybe getting rid of that checker option entirely.)
clang/docs/analyzer/checkers.rst | ||
---|---|---|
979–982 |
This is applicable to C++ too?