Adding a new checker alpha.unix.ErrorReturn.
This check should implement SEI CERT C Coding Standard rule
"ERR33-C. Detect and handle standard library errors".
Differential D71510
[clang][checkers] Added new checker 'error-return-checker'. balazske on Dec 14 2019, 2:25 AM. Authored by
Details
Adding a new checker alpha.unix.ErrorReturn. This check should implement SEI CERT C Coding Standard rule
Diff Detail
Event TimelineComment Actions
Comment Actions Works relatively good now but not perfect. The tests are sometimes too strict so there are some false positives, for example this case: unsigned long X = strtoul("345", NULL, 10); if (X > 100) { // handle error } The result is not checked for ULONG_MAX but still the code is correct in this way. But we can not figure out the intention of the programmer to detect what is an "error handling code" to check for error handling branches. Other solution is to detect any branch condition that involves the value (returned from the function) as a "test for error return value" but this is probably not better. Comment Actions Great job, this seems to be progressing nicely! please see my comments inline.
Comment Actions
Comment Actions Checker was tested with tmux, no problems found (there are false positives but not easy to fix). Comment Actions Uh-oh, what happened here? Please don't post huge patches. 600 lines of code is huge. You could start off by implementing a single check (eg., NullErrorResultChecker) with a single library function and then add more checks and more functions in follow-up patches; this would have been much easier to review and would have made the community happy. Ok, here's how i see this:
I want to discuss step 5. Do we really need to go that far? Your solution is mathematically perfect (it probably isn't just by looking at the set of the checker callbacks that you've subscribed so far, but suppose it is). But is it actually good for the humanity to have the perfect solution? Do you really want to uncover all the places where the developer has intentionally suppressed the warning for the unused result? Do you really want to engage in an arms race with a developer who wants to silence the warning? Or would everybody be much happier if you simply re-used the implementation of __attribute__((warn_unused_result)) and treat its limitations as if it's the intended behavior? Like, i myself don't have an answer to this question. I kind of understand both sides. Your approach is more costly, as you'll have to re-investigate all the previous suppression sites, but it may uncover a few more bugs, as well as some accidental omissions that were accidentally suppressed (e.g., the return value was intentionally put into a variable, but the value was never read (which is btw also a dead store, so our other checker will find it)). On the other hand, I believe that a simple AST-based approach employed by warn_unused_result would be good enough for most practical purposes, it would have low false positive rate (something we really value a lot) and provide an intuitive suppression mechanism, and it will also be much easier to maintain (which really pleases me as a maintainer who'll have to fix bugs in your code once it's committed, but ideally it shouldn't outweigh the benefits of the user). I'd love to have you collect some data on this subject, and you're in a good position to do so, given that you already have a checker. If you run your checker on a lot of code, how many warnings will be non-trivially path-sensitive (i.e., the value isn't immediately discarded but dragged around for a bit, but still not checked)? Are these warnings valuable? How many of them highlight intentional suppressions? Comment Actions The checker was implemented in smaller parts that are visible in the commit list. I can split the patch at the commits (and figure out how to use git history manipulations and phabricator in a better way?). Comment Actions I wanted to implement the rules described here: Comment Actions This patch is here for code reference only. Although the whole checker does not work safe in this way the idea behind this approach can be useful. |
Value is not the name of the parameter, maybe a refactoring missed this comment.