There are certain unsafe or deprecated (since C11) buffer handling
functions which should be avoided in safety critical code. They
could cause buffer overflows. Two new checks had been implemented
which warn for every occurrence of such functions
(unsafe or deprecated printf, scanf family, and other buffer
handling functions, which now have a secure variant).
Details
- Reviewers
zaks.anna dcoughlin NoQ xazax.hun Szelethus george.karpenkov - Commits
- rG8d239996392c: [analyzer] New checker for detecting usages of unsafe I/O functions
rC353698: [analyzer] New checker for detecting usages of unsafe I/O functions
rL353698: [analyzer] New checker for detecting usages of unsafe I/O functions
Diff Detail
Event Timeline
This does not do anything more than traversing the AST, shouldn't this be a clang-tidy check?
Also, i suspect CERT-MSC24-C might be relevant
lib/StaticAnalyzer/Checkers/CheckSecuritySyntaxOnly.cpp | ||
---|---|---|
806 | depr*e*cated |
It'd look good in clang-tidy (especially if extended to provide fixits), but if Daniel is interested in having this feature in the analyzer (and picked by clang-tidy from there), i wouldn't mind.
I wonder how noisy this check is - did you test it on large codebases? Because these functions are popular, and in many cases it'd be fine to use insecure functions, i wonder if it's worth it to have this check on by default. Like, if it's relatively quiet - it's fine, but if it'd constitute 90% of the analyzer's warnings on popular projects, that'd probably not be fine.
lib/StaticAnalyzer/Checkers/CheckSecuritySyntaxOnly.cpp | ||
---|---|---|
796–797 | Note that you cannot easily figure out if the code is intended to get compiled only under C11 and above - maybe it's accidentally compiled under C11 for this user, but is otherwise intended to keep working under older standards. | |
831 | Because it also checks deprecated buffer handling, i'd rename this function to checkDeprecatedOrUnsafeBufferHandling. | |
869–874 | You'd probably also want to quit early if the format string is not a literal. |
This patch basically extends an already existing static analyzer check. Even if tidy might be a better fit, I wonder what is the right thing to do in this case. We either end up overlapping functionality with the analyzer and tidy or have to come up with a policy what to do in this such cases.
lib/StaticAnalyzer/Checkers/CheckSecuritySyntaxOnly.cpp | ||
---|---|---|
796–797 | It is a possible scenario, how should I check if the checks should warn (safe functions are available) if not by using this method? | |
869–874 | If the format string is not a literal (i.e. a variable), currently we cannot determine if there were any restrictions regarding the size or not, so we want this check to warn. |
include/clang/StaticAnalyzer/Checkers/Checkers.td | ||
---|---|---|
420 | I do not like the naming of these two checks, It feels like one of them warns for a subset of the other, however, it is not the case. | |
lib/Driver/ToolChains/Clang.cpp | ||
2256 ↗ | (On Diff #117782) | Do we want to turn on both of these by default? Warning on every use of memset will be very noisy for example. |
lib/StaticAnalyzer/Checkers/CheckSecuritySyntaxOnly.cpp | ||
155 | This is the only print-related function in this group. Is this intended? | |
796–797 | This is the same approach some check takes in clang tidy. I think it is still better to not warn for non C11 users than warn for everybody. If a project is not interested in this check but they are interested in having C11 builds, they can turn this check off. (Or it can be off by default in the first place). |
include/clang/StaticAnalyzer/Checkers/Checkers.td | ||
---|---|---|
420 | Both checker warns if a buffer handling function is deprecated (DeprecatedOrUnsafeBufferHandling calls DeprecatedBufferHandling), but the DeprecatedOrUnsafeBufferHandling checker also warns if a function is not only deprecated but unsafe (i.e. writes a buffer without size restrictions) too. | |
lib/Driver/ToolChains/Clang.cpp | ||
2256 ↗ | (On Diff #117782) | It's true that these checkers could be very noisy if they are enabled by default, I'm going to fix this. |
lib/StaticAnalyzer/Checkers/CheckSecuritySyntaxOnly.cpp | ||
155 | sprintf and vsprintf are buffer-writing functions without size restrictions, this is the reason why only these two are checked by DeprecatedOrUnsafeBufferHandling of the printf family. There are buffer-writing members of this family which are not unsafe (because of size restrictions) but have a secure version introduced in the C11 standard, and DeprecatedBufferHandling warns for them: swprintf, snprintf, vswprintf, vsnprintf. Other printf-related functions are either printing to a file or to the standard output. |
include/clang/StaticAnalyzer/Checkers/Checkers.td | ||
---|---|---|
420 | I see. Maybe it would be better to make them disjoint? Also, I think it is not a good user experience to get two warnings for the same function call. |
include/clang/StaticAnalyzer/Checkers/Checkers.td | ||
---|---|---|
420 | Do you mean to separate them, so we would have one checker which warns for the unsafe and deprecated buffer handling functions, and one which warns for the deprecated functions which have some boundary restrictions? |
lib/StaticAnalyzer/Checkers/CheckSecuritySyntaxOnly.cpp | ||
---|---|---|
107 | 80 chars | |
176 | That's a lot of duplicated WalkAST::checkDeprecatedOrUnsafeBufferHandling. Could that be simplified? | |
810 | That's a lot of duplication of 1/0/-1. And also 1/0/-1 are cryptic symbols, why not use an enum with a descriptive name? |
lib/StaticAnalyzer/Checkers/CheckSecuritySyntaxOnly.cpp | ||
---|---|---|
810 | The duplications will be solved by using .Cases, but is using enum necessary? 1 and 0 refers to the index of the argument which could be a format string. -1 means there is no need to look for a string like this. |
lib/StaticAnalyzer/Checkers/CheckSecuritySyntaxOnly.cpp | ||
---|---|---|
810 | Right, then I guess 1 and 0 are self-descriptive, but I would still add a separate constant for -1. |
@koldaniel Have you evaluated this checker? On which codebases? Were the warnings real security issues, or were they mostly spurious? The code seems fine, but I'm not sure whether it should be in security or in alpha.
I've evaluated this checker on LLVM+Clang, there were only a few (about 15) warnings, because of the C11 flag check at the beginning of the checker body. However, if this check was removed, number of the warnings would be increased significantly. I wouldn't say the findings were real security issues, most of the warnings were about usages of deprecated functions, which has not been considered unsecure (but which may cause problems if the code is modified in an improper way in the future).
LGTM!
Any objections to commit this?
I think this is quiet coding guideline specific check which is useful for a set of security critical projects. As this is an opt in kind of check, I think it does no harm to have it upstream.
Overall I think this looks great, thanks! I left some inlines that would be nice to fix before commiting, but all of them are minor nits.
Would it be possible for you to commit the clang-formatting and the actual logic separately?
include/clang/StaticAnalyzer/Checkers/Checkers.td | ||
---|---|---|
418 | Lately we started paying attention to the 80 column limit in Checkers.td. Could you please break this line? | |
lib/StaticAnalyzer/Checkers/CheckSecuritySyntaxOnly.cpp | ||
736 | I think this newline should stay. | |
791 | This is out of place. | |
799 | Could you please add an extra newline? | |
811–818 | I wonder whether using CallDescription would be (way) more efficient here. Comparing IndentifierInfos would also be better I guess. I'm a little unsure about performance implications here, it might not be worth the chore to refactor this. | |
835–838 | Could you please start these variable names with a capital letter? | |
test/Analysis/security-syntax-checks.m | ||
274 | When using {{}}, you actually supply a regex as an argument, and the output of the analyzer is matched against it. My point is, could you instead just write // expected-warning{{Call to function 'sprintf' is insecure}} to improve readability? |
test/Analysis/security-syntax-checks.m | ||
---|---|---|
274 | Or whatever the shortest string is needed to know whether the expected output it there. |
Sorry, didn't read the discussion, there are some fair points in the quoted comments.
My problem is that LLVM+Clang isn't really a C (nor a C11) project, and I think judging this checker on it is a little misleading. Could you please test it on some C11 projects? I think tmux uses C11.
Edit: it doesn't, but CMake is mostly a C project and it does!
I do generally agree with this statement, but I'd be more comfortable either landing it in alpha or seeing some other results.
CMake really isn't a C project if you look at what language it actually uses - the C files come from tests and system utilities such as ZIP being reimplemented.
And as far as I've seen with my recent endeavours, Clang has problems with parsing and building CMake (however all these errors relate to some very niche standard library memory size related stuff not being there where it wants them).
What do we want to validate here? The lack of crashes? Or evaluate false positive ratio?
I have some doubts about evaluating this checker on open source projects. If a project does not care about the safe versions of these functions all of the results will be false positive (or a project might actually care but will not be able to comply due to portability constraints). If a project does care about using the safe variants, they are most likely already using another tool to verify this.
So I think the main value here is to subsume other tools.
To add an analogy, Clang Tidy will not require C++ Core Guidelines related checks to be evaluated on projects that are not following the guidelines as the results are meaningless for those projects.
Ok! I hope that the C11 check would do the trick, let's see how it goes :)
This checker is opt-in, but it still has a notion of a useless positive, and it is something that can often be improved upon. Crashes are also a great thing to look for :)
On the contrary, figuring out if the project follows C++ Core Guidelines is pretty hard. Well, it's often pretty easy to demonstrate that it doesn't (: But, i mean, the intention.
lib/StaticAnalyzer/Checkers/CheckSecuritySyntaxOnly.cpp | ||
---|---|---|
796–797 | Hmm, i guess in many cases such clients may suppress the warning through preprocessing. This will keep the checker warning in C11 builds but would also avoid miscompiles in builds for the older standards. |
Could you rebase please? Many things have changed since you last update. After that I'll happily commit on your behalf if you don't have a commit access just yet.
Hmmm, DescFile was removed months ago from Checkers.td, are you sure you uploaded the correct diff?
There seems to be a crash in this code. @koldaniel, would you like to take a look? https://bugs.llvm.org/show_bug.cgi?id=41185
Hi,
True, it is a faulty scenario, my question is what should be the way forward? I think in case of built-in functions there should be no warning, since they differ from the deprecated ones which come from the old standard. The only purpose of the assert was to help development and maintenance (if a new function had been added, it should be decided if it is deprecated or unsafe). Returning instead of asserting would solve the problem.
On many platforms such standard C functions are implemented as macros that expand to the respective builtins. I believe there should be no difference in behavior between the normal function and the builtin function. Cf. test/Analysis/bstring.c.
Lately we started paying attention to the 80 column limit in Checkers.td. Could you please break this line?