This is an archive of the discontinued LLVM Phabricator instance.

[analyzer] StdLibraryFunctionsChecker: Add support to lookup types
ClosedPublic

Authored by martong on May 15 2020, 8:42 AM.

Details

Summary

In this patch I am trying to get rid of the Irrelevant types from the
signatures of the functions from the standard C library. For that I've
introduced lookupType() to be able to lookup arbitrary types in the global
scope. This makes it possible to define the signatures precisely.

Note 1) fread's signature is now fixed to have the proper FILE *restrict
type when C99 is the language.
Note 2) There are still existing Irrelevant types, but they are all from
POSIX. I am planning to address those together with the missing POSIX functions
(in D79433).

Diff Detail

Event Timeline

martong created this revision.May 15 2020, 8:42 AM
balazske added inline comments.May 18 2020, 4:05 AM
clang/lib/StaticAnalyzer/Checkers/StdLibraryFunctionsChecker.cpp
747

The Optional<QualType> can be left out from the right side expressions (in the ? operator part). (A QualType should be assignable to Optional<QualType>.)

martong marked 2 inline comments as done.May 26 2020, 9:05 AM
martong added inline comments.
clang/lib/StaticAnalyzer/Checkers/StdLibraryFunctionsChecker.cpp
747

No that cannot be left out. Implicit conversion does not play when trying to determine the common type for the 2nd and 3rd argument of the ternary op. https://stackoverflow.com/questions/32251419/c-ternary-operator-conditional-operator-and-its-implicit-type-conversion-r
So, removing the explicit type I get a compile error.

balazske added inline comments.May 27 2020, 1:21 AM
clang/lib/StaticAnalyzer/Checkers/StdLibraryFunctionsChecker.cpp
747

Still I do not like that code. Probably this is better then:

Optional<QualType> FilePtrTy, FilePtrRestrictTy;
if (FileTy) {
  FilePtrTy = ACtx.getPointerType(*FileTy);
  FilePtrRestrictTy = ACtx.getLangOpts().C99
                          ? ACtx.getRestrictType(*FilePtrTy) // FILE *restrict
                          : *FilePtrTy)
}
martong updated this revision to Diff 266834.May 28 2020, 6:07 AM
martong marked an inline comment as done.
  • Rebase to master
martong updated this revision to Diff 266858.May 28 2020, 6:55 AM
martong marked 2 inline comments as done.
  • Add if (FileTy)
clang/lib/StaticAnalyzer/Checkers/StdLibraryFunctionsChecker.cpp
747

Ok, I added the if.

balazske accepted this revision.May 29 2020, 8:00 AM
balazske added inline comments.
clang/lib/StaticAnalyzer/Checkers/StdLibraryFunctionsChecker.cpp
739

There is a ASTContext::getFILEType that can be used for this. (But if more types are needed the lookupType must be used again.)

This revision is now accepted and ready to land.May 29 2020, 8:00 AM
martong marked 2 inline comments as done.May 29 2020, 8:41 AM
martong added inline comments.
clang/lib/StaticAnalyzer/Checkers/StdLibraryFunctionsChecker.cpp
739

Yeah, didn't realize that we have getFILEType until now. It's unfortunate that I added the tests for FILE in this sense.

getFILEType can return a null QualType, which unfortunately can be mixed with Irrelevant. I'd like to avoid that. Still, we could initialize the Optional<QualType> with the help of getFILEType. Maybe in a later patch it would be worth to do that (and then refactor the lookup.c[pp] tests too).

On the other hand, since we are going to use lookupType extensively with other types, I don't see much benefit to make an exemption to FILE.

A high level comment.

Trying to match function signatures is not a unique problem! In fact, almost all of the checks the analyzer have is trying to solve the very some problem.
One of the methods we have at this point is called CallDescription, see here: https://github.com/llvm/llvm-project/blob/master/clang/lib/StaticAnalyzer/Core/CallEvent.cpp#L358

Moreover, I would assume something similar needs to be done for APINotes.

Do you think it would be possible to extend the CallDescription interface to match your needs? In that case all of the checks could profit from this work.
What do you think?

This revision was automatically updated to reflect the committed changes.
martong marked an inline comment as done.

A high level comment.

Trying to match function signatures is not a unique problem! In fact, almost all of the checks the analyzer have is trying to solve the very some problem.
One of the methods we have at this point is called CallDescription, see here: https://github.com/llvm/llvm-project/blob/master/clang/lib/StaticAnalyzer/Core/CallEvent.cpp#L358

Moreover, I would assume something similar needs to be done for APINotes.

Do you think it would be possible to extend the CallDescription interface to match your needs? In that case all of the checks could profit from this work.
What do you think?

Yes, viewing from this angle, I am trying to solve a problem here that perhaps could be handled by CallDescription and CallDescriptionMap with some extensions.

Here are the differences between the two approaches so far:

  • CallDescription is evaluated for every call event (e.g. checkPreCall), the names - strings - are compared; containing declaration contexts are compared by names (see CallEvent::isCalled). Contrary to this, summaries are associated directly to FunctionDecls, so during a call event we compare two FunctionDecl pointers.
  • A CallDescription does not support overloading on different types if the number of parameters are the same. With summaries this works.
  • A CallDescriptionMap is a static map with fixed number of entries. Contrary to this, the FunctionSummaryMap contains an entry (a summary) for a function only if we can lookup a matching FunctionDecl in the TU. The initialization of the FunctionSummaryMap happens lazily during the first call event.

Except the lack of proper overloading support, these differences are in the implementation. So, yes, giving it more thought, maybe we could refactor CallDescriptionMap to behave this way, but that would be some very heavy refactoring :) Still, would be possible, I guess.

Moreover, I would assume something similar needs to be done for APINotes.

Yes, I agree, I could not find out (yet) how the do it, but somehow they must match a given FunctionDecl to the Name in the YAML.


I was thinking about an alternative method for a long time now.
Making one step backwards: what we need is to be able to associate some data to a given function. And we can perfectly identify the function with it's prototype written in C/C++.
So, what if we'd be able to write a CallDescriptionMap (or a FunctionSummaryMap) like this:

CallDescriptionMap<FnCheck> Callbacks = {
    {{"void *memcpy(void *dest, const void *src, size_t n)"}, &CStringChecker::evalMemcpy},
    ...

Actually, we could reuse the Parser and the Sema itself to solve this issue. In fact, we could achieve this by reusing the ExternalASTSource together with the ASTImporter to find precisely the existing redecl chain in the TU for memcpy.