This is an archive of the discontinued LLVM Phabricator instance.

[WIP][Analyzer] PtrToIntegCastLibcChecker
Needs ReviewPublic

Authored by jkorous on Sep 25 2020, 12:43 PM.

Details

Summary

Checker for memory address being passed to libc functions via parameters for which such value doesn't make sense.

I have a bunch more functions that I was considering but I feel that those might be used in some funky pointer arithmetics - e. g. Artem mentioned pointer hashing.

arg 0
   cos
   sin
   tan
   acos
   asin
   atan
   atan2
   cosh
   sinh
   tanh
   acosh 
   asinh 
   atanh 
   exp
   frexp
   log
   log10
   modf
   exp2 
   expm1 
   ilogb 
   log1p 
   log2 
   logb 
   pow
   sqrt
   cbrt
   erf
   erfc
   tgmamma
   lgamma

 arg 0, 1
   ldexp
   scalbn
   scalbln
   hypot
   remainder
   remquo
   copysign

Diff Detail

Event Timeline

jkorous created this revision.Sep 25 2020, 12:43 PM
jkorous requested review of this revision.Sep 25 2020, 12:43 PM
jkorous added inline comments.Sep 25 2020, 4:19 PM
clang/lib/StaticAnalyzer/Checkers/PointerToIntegralTypeCast/PtrToIntegCastLibcChecker.cpp
115

Actually, maybe this is too opinionated. Maybe there is a use-case for writing a single char via putc function family?

NoQ added a comment.Sep 27 2020, 11:23 PM

Hmm, if we could add an interface to CallDescriptionMap to accept AnyCall instead of CallEvent then we could use it for path-insensitive checkers as well. That'd have saved some lines of code.

I also wouldn't mind taking the opposite approach: assume by default that all standard library functions should be warned about, and then add exceptions as needed.

Also what's your excuse for using visitors instead of ASTMatchers this time? :) I think even a simple matcher callExpr().bind("result") that delegates all the work to the imperative match-callback may make the code a lot more concise (by the virtue of ditching the visitor). Argument type checks may look pretty nice in matcher form as well.

NoQ added inline comments.Sep 27 2020, 11:36 PM
clang/include/clang/StaticAnalyzer/Checkers/Checkers.td
25

core is a good package because this checker does in fact affect everybody but i don't want to put the checker here because we have a rule that checkers in core cannot be disabled. For that reason it has no path-insensitive checkers (because they don't interact with each other and therefore there's no technical reason to prevent the users from disabling them).

It looks like unix is the traditional place to put warnings about C standard library functions (eg., MallocChecker and StreamChecker belong to unix). It probably shouldn't be this way. I wish we had a better package for C standard library related checkers. Name suggestions anybody? I guess libc is a nice name. c is too short which might cause it to interact badly with our dreams about package-granular generic warning suppression mechanisms.

Our package system is generally pretty messy and prone to legacy / backwards compatibility concerns.

clang/lib/StaticAnalyzer/Checkers/PointerToIntegralTypeCast/PtrToIntegCastLibcChecker.cpp
115

This checker is by design a bug-prone / red flag pattern detector; it doesn't promise to find 100% bugs. It has a natural suppression ("put the casted pointer into a variable"). I wouldn't mind being warning liberally until we see actual intentional use-cases in practice.

clang/test/Analysis/Checkers/PointerToIntegralTypeCast/PtrToIntegCastLibcChecker.c
11

Wait, does this actually work? I thought you have to use expected-warning-re and an extra pair of double braces to get regexps working, otherwise it looks for a literal .*.

Szelethus added inline comments.Sep 28 2020, 12:09 AM
clang/include/clang/StaticAnalyzer/Checkers/Checkers.td
25

libc sounds nice! Hopefully we can migrate to a better labeling system sooner rather than later -- its a plus that everyone agrees that we should, and I personally don't foresee any large obstacles in the way of that.

@martong has been working hard on libc support for the analyzer, I'll add him :)

In D88332#2297374, @NoQ wrote:

Also what's your excuse for using visitors instead of ASTMatchers this time? :) I think even a simple matcher callExpr().bind("result") that delegates all the work to the imperative match-callback may make the code a lot more concise (by the virtue of ditching the visitor). Argument type checks may look pretty nice in matcher form as well.

+1

I would even go further. Why Clang Static Analyzer? I see nothing path-sensitive here. IMHO a better place to implement this functionality is Clang-Tidy.

clang/test/Analysis/Checkers/PointerToIntegralTypeCast/PtrToIntegCastLibcChecker.c
11

Whether it works or not, we never do that actually. Please write out the warning we should get here.

jkorous marked 3 inline comments as done.Sep 28 2020, 11:43 PM
In D88332#2297374, @NoQ wrote:

Hmm, if we could add an interface to CallDescriptionMap to accept AnyCall instead of CallEvent then we could use it for path-insensitive checkers as well. That'd have saved some lines of code.

Sounds interesting but I lack context to imagine the details. Would the map be still initialized by some ASTVisitor? Do you think there'd be any noticeable performance hit?

I also wouldn't mind taking the opposite approach: assume by default that all standard library functions should be warned about, and then add exceptions as needed.

How about I add the functions I mentioned above to alpha variation of this checker?

Also what's your excuse for using visitors instead of ASTMatchers this time? :) I think even a simple matcher callExpr().bind("result") that delegates all the work to the imperative match-callback may make the code a lot more concise (by the virtue of ditching the visitor). Argument type checks may look pretty nice in matcher form as well.

Busted... I just reached for the tool I am used to. I think you're right - let me use a matcher here.

clang/lib/StaticAnalyzer/Checkers/PointerToIntegralTypeCast/PtrToIntegCastLibcChecker.cpp
115

Ok.

FWIW I was intentionally pretty conservative to not deviate too much from the general expectation I assume people have of the analyzer - not too much false positives.
Here, putc is probably fine but otherwise I won't feel comfortable producing too much noise - again it's just my assumption but I feel that once someone decides that a particular checker produces too much false positives (esp. at callsites) we lose that user for the foreseable future.

WDYT? Am I overthinking it?

clang/test/Analysis/Checkers/PointerToIntegralTypeCast/PtrToIntegCastLibcChecker.c
11

For sure! Sorry, I never intended to land it as it is (this definitely falls under the "WIP"). I just somewhat expected that name/package might change and given that expectation - I was "lazy" and just added the TODO below.

@NoQ You're right - this doesn't work. It's just a half-baked draft of tests based on my ad-hoc test file.

Since you are here - I was actually more wondering about what test coverage should I aim for. Is a somewhat random trio of libc functions with somewhat diverse parameters good enough here?

jkorous marked an inline comment as done.Sep 29 2020, 12:01 AM

I would even go further. Why Clang Static Analyzer? I see nothing path-sensitive here. IMHO a better place to implement this functionality is Clang-Tidy.

This concern has been brought up before. I hope you won't mind me pasting @NoQ 's response as it nicely summarizes the situation.

... Also these tools are not a drop-in replacement of each other: you cannot use clang-tidy in all the places where you can use the static analyzer (the opposite is partially true due to libStaticAnalyzer integration into clang-tidy but UI degrades dramatically when such integration is used) therefore many authors simply do not have a choice where to put the check (i'm slowly working on improving this situation so that most path-insensitive checks could indeed be migrated into clang-tidy but we're not there yet). So as of today we have no choice but to let our contributors decide freely where they want their checks to be.

NoQ added a comment.Oct 12 2020, 2:46 PM
In D88332#2297374, @NoQ wrote:

Hmm, if we could add an interface to CallDescriptionMap to accept AnyCall instead of CallEvent then we could use it for path-insensitive checkers as well. That'd have saved some lines of code.

Sounds interesting but I lack context to imagine the details. Would the map be still initialized by some ASTVisitor?

Nah, it's brace-initialized with names as usual.

Do you think there'd be any noticeable performance hit?

Your code is already linear. As of now CallDescriptionMap is linear too, but if we are to make any improvements to that, i'd rather do them once in CallDescriptionMap than duplicate them in every checker. The problem of identifying functions in the AST is so ridiculously common that we absolutely have to have a unified solution to it. It really hurts when 100 checkers are making the same mistakes over and over again (not accounting for builtins, not accounting for inline namespaces, crashing on accessing a non-existing argument of a call, etc.).

clang/lib/StaticAnalyzer/Checkers/PointerToIntegralTypeCast/PtrToIntegCastLibcChecker.cpp
115

I mean, like, as part of our testing we should take a look at the warnings and see how bad they are when they're wrong and use best judgement (:

For putc specifically i wouldn't be shocked that some code would actively use that to print pointers. But i'd be shocked if that's actually a lot of warnings on a single project, rather than "that one lonely function in which it actually makes sense".

The opposite thing we could do would be to only warn on functions that do inherently introduce some source of confusion. Say, you could mix up the two arguments in strncpy() and accidentally swap the pointer and the size (this partially overlaps with clang-tidy's bugprone-swapped-arguments check).