Warnings and notes of checker alpha.unix.StdLibraryFunctionArgs are
improved. Previously one warning and one note was emitted for every
finding, now one warning is emitted only that contains a detailed
description of the found issue.
Details
Diff Detail
- Repository
- rG LLVM Github Monorepo
Event Timeline
Awesome, been a long time coming!!
Other than the minor observation of changing "of function" to "to", I'm inclined to accept this patch. We definitely should describe what the value IS, not just what is should be (aside from the cases concerning nullness, I think they are fine as-is), but seems to be another can of worms deserving of its own patch.
The reason why I'd solve the description of the argument separately is that other checkers also suffer from this problem (alpha.security.ArrayBoundV2, hello), and it opens conversations such as "What if we can't sensibly describe that value?", and on occasions, we have felt that the go-to solution should be just suppressing the warning. But then again, its an intentional false negative.
clang/lib/StaticAnalyzer/Checkers/StdLibraryFunctionsChecker.cpp | ||
---|---|---|
907–909 | Looking at the tests, the word 'function' may be redundant and unnecessary. | |
924–945 | Here too. | |
958–961 | Here too. | |
972 | We are saying what the value should be, but neglect to say what it is instead. Something like this would be good: "The value of the 1st argument to _nonnull should be equal to or less then 10, but is 11". This is what I (and many of our users IIRC) miss in checker messages, like in those where we are supposed to describe what the value of the index is (besides it causing a buffer overflow). |
I had the plan to add a note that tells the value that is out of range. In the most simple case it can work if we have a fixed value. But this may be the most obvious from the code too. Otherwise a new function can be added that can print the assumed constraint ranges.
Probably it is not always useful to explain why the argument is wrong. In cases when we can find out that the value is exactly between two consecutive valid ranges we can display a note, or when the exact value is known. Otherwise it may end up in something like "the value should be between 1 and 4 or between 6 and 10, but it is less than 1 or exactly 5 or greater than 10". The "good" cases are (when more information is available): "the value is less than 1", "the value is 5", "the value is greater than 10". If two bad ranges are known it may be OK too: "the value is less than 1 or it is exactly 5". The explanation may be possible to implement by test assumes for the negated ranges.
A small nit, otherwise LGTM.
You're totally right. I'm not sure how to approach tricky cases like that either.
The "good" cases are (when more information is available): "the value is less than 1", "the value is 5", "the value is greater than 10". If two bad ranges are known it may be OK too: "the value is less than 1 or it is exactly 5". The explanation may be possible to implement by test assumes for the negated ranges.
Maybe we can start out with the easier cases, but we should check whether there is a checker that solves a different problem, and should try to aim at a somewhat general, reusable solution.
clang/lib/StaticAnalyzer/Checkers/StdLibraryFunctionsChecker.cpp | ||
---|---|---|
95 | Using a raw_ostream as a parameter sounds more elegant than a SmallString with a precise stack buffer length. Not to mention that you could call this function with llvm::errs() for easy debugging. | |
100 | Here as well. | |
clang/test/Analysis/std-c-library-functions-arg-constraints-notes.cpp | ||
51–79 | Please clang-format the changes in this test file. |
I changed the text descriptions to relational operators at some places, this is similar to how other clang analyzer messages look (for example notes at if statements), it is shorter, and probably more faster to understand ("<= 0" instead of "non-positive").
clang/lib/StaticAnalyzer/Checkers/StdLibraryFunctionsChecker.cpp | ||
---|---|---|
95 | I want to improve this in a later patch. The change does involve not new code too (getArgDesc should be changed too and other places where the messages are generated). |
Using a raw_ostream as a parameter sounds more elegant than a SmallString with a precise stack buffer length. Not to mention that you could call this function with llvm::errs() for easy debugging.