This is an archive of the discontinued LLVM Phabricator instance.

[analyzer][NFC] Introduce CallDescription::matches() in addition to isCalled()
ClosedPublic

Authored by steakhal on Nov 10 2021, 10:45 AM.

Details

Summary

This patch introduces CallDescription::matches() member function,
accepting a CallEvent.
Semantically, Call.isCalled(CD) is the same as CD.matches(Call).

The patch also introduces the matchesAny() variadic free function template.
It accepts a CallEvent and at least one CallDescription to match
against.

Diff Detail

Event Timeline

steakhal created this revision.Nov 10 2021, 10:45 AM
steakhal updated this revision to Diff 386849.Nov 12 2021, 7:42 AM

use free-function instead of a CallDescription static one to enable terser syntax for calling this by using ADL.
Also, add Doxygen group comment.

I like the idea of introducing variadic versions of the matching. But what is the motivation behind introducing new methods that are discouraged to use? I think the motivation behind these changes should be included in the description and the commit message.

clang/include/clang/StaticAnalyzer/Core/PathSensitive/CallDescription.h
100

I am confused. When do we use snake_case and when do we use likeThis?

109

What is the C++ standard used in LLVM nowadays? If C++17 or above, couldn't you simplify this using fold expressions?

I like the idea of introducing variadic versions of the matching.

They existed earlier but they were actually never used for some reason.

But what is the motivation behind introducing new methods that are discouraged to use?

CallDescriptions are meant to be a value-type groupping/holding all data necessary to 'consume' a function call event.
They are not designed to do caching but to blindly do a matching to see if the given call event is interesting or not.
Matching a CallDescription can be quite expensive, even more so in the future when I will propose introducing AST-matchers
for matching the types of the functions to let us select the right overload.
By doing so, we could read config files and populate CallDescriptionMaps at runtime, opening further configuration possibilities.

In contrast to this, CallDescriptionMaps are designed so that they are supposed to use some sort of caching mechanism.
In theory, we should try hard to reduce the matching complexity to O(1), since this operation is heavily used by the checkers.
One way to achieve this is to associate a function decl pointer to a CallDescription and compare only the call event's function decl against the cached one, which would achieve our O(1) goal.
At least if the call description is supposed to handle only a specific overload, which I presume dominates the rest of the use-cases, but I haven't checked this assumption yet, nor implemented any of these.

I think the motivation behind these changes should be included in the description and the commit message.

Sorry about that, I'll add it to the head commit of the stack.

NoQ added a comment.Nov 16 2021, 6:10 PM

In theory, we should try hard to reduce the matching complexity to O(1), since this operation is heavily used by the checkers.

It is but I don't recall this ever being an actual problem. These maps are usually tiny and function calls are so expensive on their own that such iteration becomes negligible. But generally, yeah, one of the reasons we have CallDescriptionMap is so that we didn't have to copy-paste such optimization from checker to checker.

clang/include/clang/StaticAnalyzer/Core/PathSensitive/CallDescription.h
108

Ts is always CallDescription, right? I'm not good with variadic templates but it sounds like we should be able to specify this.

109

I think we're still on C++14.

ASDenysPetrov added inline comments.Nov 17 2021, 7:09 AM
clang/include/clang/StaticAnalyzer/Core/PathSensitive/CallDescription.h
100

According to the coding standards we use snake_case for STL-like functions, but I couldn't find matches and matches_any among them.

109

What is the C++ standard used in LLVM nowadays?

As we can see, it is C++14. So fold expressions are not the case unfortunately.

martong added inline comments.Nov 18 2021, 8:43 AM
clang/include/clang/StaticAnalyzer/Core/PathSensitive/CallDescription.h
87
95–97

Instead of this comment, let's enforce what you mean by making these member functions private and making CallDescriptionMap a friend!

(And by doing that you have to update the summary of the patch as well to reflect the changes ...)

martong added inline comments.Nov 18 2021, 9:19 AM
clang/include/clang/StaticAnalyzer/Core/PathSensitive/CallDescription.h
95

Actually you are talking about just one function aren't you?
bool matches(const CallEvent &Call) const

The friend functions below are part of the public API we provide for the checkers looking at the next patch:

bool BlockInCriticalSectionChecker::isBlockingFunction(const CallEvent &Call) const {
  return matches_any(Call, SleepFn, GetcFn, FgetsFn, ReadFn, RecvFn);
steakhal updated this revision to Diff 388428.Nov 19 2021, 2:12 AM
steakhal marked 8 inline comments as done.
steakhal retitled this revision from [analyzer][NFC] Introduce CallDescription::matches() addition to isCalled() to [analyzer][NFC] Introduce CallDescription::matches() in addition to isCalled().
steakhal edited the summary of this revision. (Show Details)
  • matches_any() -> matchesAny()
  • fixed some typos
  • removed the discouraging comment reminding users not using CallDescriptions directly
steakhal added inline comments.Nov 19 2021, 2:13 AM
clang/include/clang/StaticAnalyzer/Core/PathSensitive/CallDescription.h
95

I actually meant the whole overload set in the broadest term, including matches and matches_any as well.
But I decided to remove this comment since it doesn't make much sense to enforce a specific usage/pattern unless we have a compelling reason to do so. Since I cannot see (yet) how could we implement caching in an efficient way I would rather push these to the future.

95–97

I'm not sure about that, I would rather remove the note altogether. See my previous comment why.

108

AFAIK this is the best we can do, unless of course we want heavy TMP stuff achieving the same thing at the end of the day.
If you don't pass a CallDescription, the recursion won't match with the base-case, thus we get a compile error. However, I think we shouldn't really worry about usercode misusing this API.

109

I miss them so much! Although we have to comply with c++14.
https://github.com/llvm/llvm-project/blob/main/llvm/CMakeLists.txt#L48-L49

martong accepted this revision.Nov 19 2021, 2:54 AM

Okay, LGTM!

This revision is now accepted and ready to land.Nov 19 2021, 2:54 AM
This revision was landed with ongoing or failed builds.Nov 19 2021, 9:33 AM
This revision was automatically updated to reflect the committed changes.
Herald added a project: Restricted Project. · View Herald TranscriptNov 19 2021, 9:33 AM
Herald added a subscriber: cfe-commits. · View Herald Transcript