Add a few functions for libc. The code in this patch that adds the summaries is
auto generated from Cppcheck's config directory
(https://github.com/danmar/cppcheck/blob/master/cfg/std.cfg). Note that not all
functions are lifted in this patch. Probably another patch will follow with more
summaries for libc.
Details
Diff Detail
- Repository
- rG LLVM Github Monorepo
Event Timeline
There a couple functional lines commented out, and we need more tests, so I take that the patch is a proof of concept for now? Because its pretty cool if so. :)
I think testing summaries this way can be really hard to manage in the future.
I see two possible ways forward to make this easier:
a) Make something like https://reviews.llvm.org/D78118 that will also dump the actual summary in a textual form, not only the fact that a summary was loaded and check for that textual form.
b) Make a small helper library for testing summaries. E.g., most of the buffer handling functions have similar summaries, so we could have only a couple of functions for testing buffer related summaries and we could just pass in function pointers (as templates or params) instead of duplicating code. Similarly, testing output ranges could be generalized.
I think b) might be a bit better solution.
Although I do see why some people like to keep tests simple without additional layers of abstractions, so I do not insist :)
My idea about the testing is the following:
We test each individual argument constraint by itself with a test function. E.g. in case of buffer sizes we have a test function specifically for it (and it requires a debug checker to be enabled)
addToFunctionSummaryMap( "__buf_size_arg_constraint", Summary(ArgTypes{ConstVoidPtrTy, SizeTy}, RetType{IntTy}, EvalCallAsPure) .ArgConstraint( BufferSize(/*Buffer=*/ArgNo(0), /*BufSize=*/ArgNo(1))));
And then in the test file we have the exact expectations:
int __buf_size_arg_constraint(const void *, size_t); void test_buf_size_concrete() { char buf[3]; // bugpath-note{{'buf' initialized here}} __buf_size_arg_constraint(buf, 4); // \ // report-warning{{Function argument constraint is not satisfied}} \ // bugpath-warning{{Function argument constraint is not satisfied}} \ // bugpath-note{{Function argument constraint is not satisfied}} } ...
This way we have a test coverage for each argument constraint. The summaries are composed from these argument constraints. There may be issues with the composition, so we do have some additional test cases for that, e.g we have a test to check the expectations when one function has more than one constraints:
int __two_constrained_args(int, int); void test_constraints_on_multiple_args(int x, int y) { // State split should not happen here. I.e. x == 1 should not be evaluated // FALSE. __two_constrained_args(x, y); clang_analyzer_eval(x == 1); // \ // report-warning{{TRUE}} \ // bugpath-warning{{TRUE}} \ // bugpath-note{{TRUE}} clang_analyzer_eval(y == 1); // \ // report-warning{{TRUE}} \ // bugpath-warning{{TRUE}} \ // bugpath-note{{TRUE}} }
So, we have tests for each individual building blocks and for their conjunctions, which renders the testing of all concrete summaries superfluous and bloated. It would be as if a compiler was tested for all possible inputs.
On top of all this, errors (assertions) with concrete summaries should be discovered by running the analysis on real world projects like tmux, bitcoin, etc.
clang/lib/StaticAnalyzer/Checkers/StdLibraryFunctionsChecker.cpp | ||
---|---|---|
1123–1124 | The three-way state split is unjustified here. Usage of abs is not a sufficient indication that the value may be 0, otherwise: int foo(int x, int y) { int z = abs(y); // Assuming 'abs' has taken branch on which y == 0... return x / z; // ... we'll be forced to emit a division by zero warning here. } Generally, there are very few cases when state splits upon function calls are justified. The common cases are:
|
clang/lib/StaticAnalyzer/Checkers/StdLibraryFunctionsChecker.cpp | ||
---|---|---|
1123–1124 |
Or something that kinda resembles bool (eg., isalpha() returns a variety of different ints in practice due to its static lookup table implementation strategy but the user only cares about whether it's zero or non-zero). |
clang/test/Analysis/std-c-library-functions.c | ||
---|---|---|
231 | Emm :) |
clang/lib/StaticAnalyzer/Checkers/StdLibraryFunctionsChecker.cpp | ||
---|---|---|
1123–1124 | Alright, I agree, I'll remove these cases. Generally speaking I realized that it is hard to create these cases, and it is very rare when we can come up with a meaningful case set for any function. So while we are at it, could you please take a look at another patch, where I add no cases just argument constraints: D82288 ? | |
clang/test/Analysis/std-c-library-functions.c | ||
231 | Ups, ouch :D |
clang/lib/StaticAnalyzer/Checkers/StdLibraryFunctionsChecker.cpp | ||
---|---|---|
1123–1124 | But it could make sense that if the input of the function is known before the call, we can compute the output? For abs(x) if we know that x == 0 is true the checker could assume abs(x) == 0 too. These "cases" provide data for this functionality. |
clang/lib/StaticAnalyzer/Checkers/StdLibraryFunctionsChecker.cpp | ||
---|---|---|
1123–1124 |
Yes, this is absolutely true. I think, we could approach this with introducing a postcall Kind for the summaries. Currently, we already have EvalCallAsPure and NoEvalCall for evalCall. We could have let's say enum PostCallKind { UnconditionedStateSplit, ConstrainReturnIfArgConstraintsAreSatisfied } For the ConstrainReturnIfArgConstraintsAreSatisfied we should check if the preconditions for the Args in the Case are all evaluated to Known and if so only would we apply the constraint on the return value. |
The three-way state split is unjustified here. Usage of abs is not a sufficient indication that the value may be 0, otherwise:
Generally, there are very few cases when state splits upon function calls are justified. The common cases are: