Success or failure messages are now shown at all checked functions, if the call
(return value) is interesting.
Additionally new functions are added: open, openat, socket, shutdown
Details
Diff Detail
- Repository
- rG LLVM Github Monorepo
Event Timeline
These code changes seem to be promising. Please upload results on the open source projects (like the ones that you uploaded previously), and I hope that after verifying those I we can finally merge this set of commits.
Results from the last run (the same applies as before, the run postfix _3 must be selected at top right corner):
memcached_1.6.8_stdclf_notetag_interesting_test_3 | Reports |
tmux_2.6_stdclf_notetag_interesting_test_3 | Reports |
curl_curl-7_66_0_stdclf_notetag_interesting_test_3 | Reports |
twin_v0.8.1_stdclf_notetag_interesting_test_3 | Reports |
vim_v8.2.1920_stdclf_notetag_interesting_test_3 | Reports |
openssl_openssl-3.0.0-alpha7_stdclf_notetag_interesting_test_3 | Reports |
sqlite_version-3.33.0_stdclf_notetag_interesting_test_3 | Reports |
ffmpeg_n4.3.1_stdclf_notetag_interesting_test_3 | Reports |
postgres_REL_13_0_stdclf_notetag_interesting_test_3 | Reports |
xerces_v3.2.3_stdclf_notetag_interesting_test_3 | Reports |
bitcoin_v0.20.1_stdclf_notetag_interesting_test_3 | Reports |
I looked through most of the open source results, and they look promising.
However I've seen one questionable tendency: there are many reports (e.g. this one) where the checker assumes that fileno(stdin), fileno(stdout) or fileno(stderr) fails. How realistic are these assumptions? Do we need to bother the programmer with these or should/can we silence them (and perhaps return the known fileno values corresponding to these standard streams)?
Another, concrete issue is this report where the analyzer assumes that ftell returns -1 (that gets converted to 2**64-1 because unsigned numbers are crazy), but there is no note tag (not even in the _3 run) and I don't see where does this assumption come from (although I didn't do a deep analysis).
Apart from these, the false positives are coming from general limitations of the engine that cannot be reasonably avoided.
The standard streams may need special handling, this can be useful for StreamChecker too. One problem is that the standard streams can be changed by the program, so we can not know for sure if these are the original values. Still it can be better to assume that fileno can not fail if used with the standard streams.
The result with ftell looks interesting, I checked this case already and I think the note tag is missing because there is a hidden conversion so the original symbol (that is set to interesting) is different than the real ftell function call.
Ok, then I think you can finally merge this commit and its two predecessors. I'm happy to see that this improvement is complete.
I hope that you can later add a special case for the standard streams, but I agree that it should be a separate commit.
Thanks for clarifying the ftell situation, that's clearly an engine issue that should not block this checker improvement. However, I'd be really glad to see a cleanup of the interestingness system (as a separate project), because its inconsistency leads to lots of very confusing bug reports.
It would be more simple to handle the standard streams in StreamChecker only. There it is possible to detect standard streams (should be variables with the known names) as arguments to functions. If StreamChecker eliminates the failure branch of fileno it will disappear from the analysis (order of checker callbacks does not matter, at the end only the correct branch remains). If stream checker is not enabled we will still get the failure for fileno(stdin). StdLibraryFunctionsChecker does not have a mechanism to detect special variables to arguments, probably it is possible to implement with a special type of argument constraint.
That's a reasonable plan, especially if we can bring that checker out of alpha in the foreseeable future. I like that it avoids code duplication.
About the "fileno with standard stream" problem: I do not see an always good solution for this case because these standard streams are global variables. If we want to be exact, we can not know much about these at analysis start unless it is the main function. The standard stream variables can be overwritten by the program or opened or closed, even if not, any unknown function can change the state of these. Because the possibility to change these, the file number can change.
At least it is possible to add a checker option for "the program does not manipulate standard streams". If this is true the standard streams are different values from all other opened streams and the file number is known. This can be the default value, most programs probably work this way.