This is an archive of the discontinued LLVM Phabricator instance.

[analyzer] StdLibraryFunctionsChecker: Add sanity checks for constraints
ClosedPublic

Authored by martong on Apr 7 2020, 9:06 AM.

Details

Summary

Once we found a matching FunctionDecl for the given summary then we
validate the given constraints against that FunctionDecl. E.g. we
validate that a NotNull constraint is applied only on arguments that
have pointer types.

This is needed because when we matched the signature of the summary we
were working with incomplete function types, i.e. some intricate type
could have been marked as Irrelevant in the signature.

Diff Detail

Event Timeline

martong created this revision.Apr 7 2020, 9:06 AM
balazske added inline comments.Apr 8 2020, 2:33 AM
clang/lib/StaticAnalyzer/Checkers/StdLibraryFunctionsChecker.cpp
116

Is it possible to have the validate on the signature (not function decl)? Because the function decl (where the rule is applied) should match the signature already. For example, if the signature contains an irrelevant type it is not good to have a rule that expects a void* at that place. After these rules are validated with the signature, they should be applicable on a (any) function that matches the signature.

411

So this is a check to be used at debug build only to check for programming errors?
It could be better to separate this task from setting the FD itself. Or this function can be called setFD and contain a validation if in debug mode (or always). Probably it is better to do the validation always to filter out unexpected found functions (if some unusual API is encountered).

martong marked 2 inline comments as done.Apr 8 2020, 9:25 AM
martong added inline comments.
clang/lib/StaticAnalyzer/Checkers/StdLibraryFunctionsChecker.cpp
116

void* would not be a problem. However, there are more intricate pointer types like FILE *. To get that type from the ASTContext we'd have to do another lookup for "FILE" which would complicate things. So, rather we mark that as Irrelevant in the signature. That's what we do in case of fread for instance. And after lookup we will have the concrete type in our hands, and that is the moment where we can do all the validation.

Perhaps we could alternatively have different kind of Irrelevant types like IrrelevantPtr, etc. But we'd forget to enumerate all of them.

411

The goal here is to avoid inadvertently adding a summary to a wrong function. This is all because of irrelevant types. The summary as described in the form of a signature and with cases and argument constraints has several presumptions on the types. And the type may be not concrete (irrelevant) so we cannot check the presumptions. We can validate those presumptions once we have the full and complete type.

It could be better to separate this task from setting the FD itself.

Yes, you are right, I am going to separate that.

Probably it is better to do the validation always to filter out unexpected found functions

Yes, I agree. I am not sure about the assertions though. Right now the programmer writes the summaries (e.g. me). But once in the future, a library author could write summaries in JSON format. Perhaps a warning diagnostic would be more appropriate that time. Or we could have a checker option that logs out if a function was added to the summary map.

The idea is noble with the addition of validate functions, assert in debug builds and just move on in release. However, I'd expect it to be integrated into the signature matching function.

clang/lib/StaticAnalyzer/Checkers/StdLibraryFunctionsChecker.cpp
317–319

It might be worth putting a TODO here to not forget the constness methods :^)

800

This looks a bit odd, we're checking whether the function matches, and than we validate right after? Shouldn't we just not match the FD if it isn't valid?

martong marked 7 inline comments as done.May 4 2020, 9:56 AM
martong added inline comments.
clang/lib/StaticAnalyzer/Checkers/StdLibraryFunctionsChecker.cpp
317–319

Two types do not match if one of them is a const and the other is non-const. Is this what you're referring for?

800

Yeah, ok, I moved the validation back into matchesSignature.

martong updated this revision to Diff 261855.May 4 2020, 9:56 AM
martong marked 2 inline comments as done.
  • Better encapsulate the data in a 'Summary'
balazske added inline comments.May 6 2020, 12:56 AM
clang/lib/StaticAnalyzer/Checkers/StdLibraryFunctionsChecker.cpp
133

Function names should be verb phrases. Maybe checkSanity is good or validateInternal, validateMore (this function relates to validate, not something different from it as "sanity").

validate can be called isValid or checkValidity, so we can not think (from the name) that the function changes something on the object. (checkValidity is good, then sanityCheck can be renamed to checkSpecificValidity or like this.)

372–373

I would extend the documentation with information about that the signature and constraints together contain information about what functions are accepted by the summary. The signature can use "wildcards" (the irrelevant types) and the constraints may specify additional rules for that type (that comes from the kind of constraint).

416

Again isValid or checkValidity is a better name, or validateByConstraints, matchesConstraints. Additionally matchesSignature checks in its current form not only for signature, so its name is not correct too.

800

I think these are not related, there should be signature match and validness check, but it is natural to check first the signature, then the validness.

martong updated this revision to Diff 266879.May 28 2020, 7:42 AM
martong marked an inline comment as done.
  • Rebase to master
martong marked 7 inline comments as done.May 28 2020, 9:31 AM
martong added inline comments.
clang/lib/StaticAnalyzer/Checkers/StdLibraryFunctionsChecker.cpp
133

Thanks for the naming suggestions, they are definitely better than my naming!

372–373

OK, I added some more docs to the class.

416

Ok, renamed to validateByConstraints.

800

@Szelethus, @balazske : I agree with both of you so I renamed matchesSignature to matches, which is a shorthand to the the signature match and then the validity check (and added a comment):

// Returns true if the summary should be applied to the given function.
bool matches(const FunctionDecl *FD) const {
  return Sign.matches(FD) && validateByConstraints(FD);
}
martong updated this revision to Diff 266918.May 28 2020, 9:31 AM
martong marked 4 inline comments as done.
  • Rename: validate -> checkValidity, sanityCheck -> checkSpecificValidity
  • Rename: Summary::checkValidity -> validateByConstraints
  • Add more docs to the Summary class
  • Rename matchesSignature -> matches

The code looks basically good to me, some documentations can be improved.

clang/lib/StaticAnalyzer/Checkers/StdLibraryFunctionsChecker.cpp
66–67

This text above is not the documentation of Summary (is it attached to Summary by doxygen?). Probably not /// is needed, only //. And it is probably out of date now, at least I can not understand immediately what is it about (what typedef's are these, what kind of "nesting types").

116

We check here that a function is a suitable candidate to be used with the constraint? (We select a function for the constraint, not a constraint for a function.) Maybe a better description would help to clarify this.

643

It may be better to see type of VRS instead of auto (and know what the VRS abbrevation means, why not CC for case constraint and not VC for ValueConstraint). Same for VR below.

800

Suggestion: Maybe we can have one function for matches and setFunctionDecl (set it if matches). We do not want to change the function decl later anyway, and not to something that does not match.

martong updated this revision to Diff 267177.May 29 2020, 4:31 AM
martong marked 8 inline comments as done.
  • Add comments, remove auto from 'for' loops, matches -> matchesAndSet
balazske accepted this revision.May 29 2020, 5:33 AM
This revision is now accepted and ready to land.May 29 2020, 5:33 AM
This revision was automatically updated to reflect the committed changes.

arc adds many unneeded tags from Phabricator. You can drop Reviewers: Subscribers: Tags: and the text Summary: with the following script:

arcfilter () {
  arc amend
  git log -1 --pretty=%B | awk '/Reviewers:|Subscribers:/{p=1} /Reviewed By:|Differential Revision:/{p=0} !p && !/^Summary:$/ {sub(/^Summary: /,"");print}' | git commit --amend --date=now -F -
}

Reviewed By: is considered important by some people (https://lists.llvm.org/pipermail/llvm-dev/2020-January/137889.html). You should keep the tag. (I started to use --date=now because some people find author date != committer date annoying. The committer date is usually what people care.))

arc adds many unneeded tags from Phabricator. You can drop Reviewers: Subscribers: Tags: and the text Summary: with the following script:

arcfilter () {
  arc amend
  git log -1 --pretty=%B | awk '/Reviewers:|Subscribers:/{p=1} /Reviewed By:|Differential Revision:/{p=0} !p && !/^Summary:$/ {sub(/^Summary: /,"");print}' | git commit --amend --date=now -F -
}

Reviewed By: is considered important by some people (https://lists.llvm.org/pipermail/llvm-dev/2020-January/137889.html). You should keep the tag. (I started to use --date=now because some people find author date != committer date annoying. The committer date is usually what people care.))

Thanks for pointing this out and for the script! I'll try to omit unnecessary tags from arc in the future.

clang/lib/StaticAnalyzer/Checkers/StdLibraryFunctionsChecker.cpp
66–67

Ok, I removed this outdated comment.

116

We check here, whether the constraint is malformed or not. It is malformed if the specified argument has a mismatch with the given FunctionDecl (e.g. the arg number is out-of-range of the function's arguments list).

Added a comment in the code about that.

643

Indeed, I agree. Removed auto and refreshed the outdated variable names.

800

Alright, I renamed matches to matchesAndSet and we set the FD in this function now. Also setFunctionDecl is removed.