This is an archive of the discontinued LLVM Phabricator instance.

[Analyzer][WebKit] RefCntblBaseVirtualDtorChecker & shared utils
ClosedPublic

Authored by jkorous on Mar 31 2020, 2:41 PM.

Details

Summary

I'm starting to put up patches for WebKit checkers in analyzer. I have the first batch of 3 checkers pretty much ready and would like to get some feedback before I proceed with the rest.

I also have a bunch of questions.

  • ASTMatchers
    • I was considering rewriting the visitors to matchers but since I haven't really used I don't have intuition for them. From other uses it seems to me I won't be able to express all the logic currently implemented in shouldSkipFoo methods (including next patches too). Would having only part of that logic expressed as matcher still be a win?
  • documentation
    • What does Documentation<HasDocumentation> in Checkers.tddo?
    • Also, the checkers for WebKit are designed as a system of rules and although there are no implementation dependencies between individual checkers they make sense only as a whole. What would be a good place for description of the rules? Just a README in clang/lib/StaticAnalyzer/Checkers/WebKit or somewhere in clang/docs/analyzer?
  • tests organization
    • Seems like the convention is one file per checker. Correct?

Diff Detail

Event Timeline

jkorous created this revision.Mar 31 2020, 2:41 PM
jkorous updated this revision to Diff 254012.Mar 31 2020, 2:57 PM

Fix typo in doc comment

NoQ added a comment.Mar 31 2020, 6:32 PM

ASTMatchers

They're very flexible. You can express arbitrary parts of the logic with them, you can combine and re-use the combinations by putting them into variables or producing via factory functions. On top of that, it's ridiculously easy to define your own completely custom matchers if you're interested in, say, making queries about a specific property of the node that nobody queried before with matchers; such custom matchers may be either made public or remain local to your checker.

Your code looks very matcher-ish. I wouldn't be surprised if you can express it with a few matchers with minimal customizations.

The "skip" part is done via unless() matcher. Base class traversal is done by isDerivedFrom() matchers that accept arbitrary sub-matchers for the base class. Etc.

documentation

https://clang-analyzer.llvm.org/available_checks.html is where it should ultimately appear. In the monorepo it's in clang/www/analyzer in plain html. I think Documentation<HasDocumentation> has no formal meaning for now, just a markup for easier figuring out if we need one; @Szelethus am i right?

Seems like the convention is one file per checker. Correct?

Usually yes, but very situational. Say, if the checker checks global operator new, you clearly can't have all tests in one file. Some checkers work for different languages so again more files. Some tests test interactions between different checkers. Some tests are just huge so people decided to put them in different files.

clang/include/clang/StaticAnalyzer/Checkers/Checkers.td
1511

"Alpha" is an indication that you want to do more work on these checkers before enabling them by default. What work do you foresee?

clang/lib/StaticAnalyzer/Checkers/WebKit/ASTUtils.h
27–28

A code or AST dump example would have been super helpful. I don't understand what "Origin" means. Is this an lvalue that the expression loads from?

clang/lib/StaticAnalyzer/Checkers/WebKit/RefCntblBaseVirtualDtorChecker.cpp
23

Why not directly check::ASTDecl<CXXRecordDecl>? (I don't remember what exactly does it do).

25–26

These days we don't do lazy initialization anymore, just BugType Bug{"Description", "Category"}.

137

Unlike clang frontend, we usually encourage our warnings and notes to sound like full sentences; there's no need to shorten or abbreviate things in order to fit into 80 characters of a terminal or something because we're by design a GUI tool, so understandability comes first.

138

WebKit isn't a category of bugs, it's a browser engine :) I suggest something like "WebKit coding guidelines" or something along those lines.

clang/test/Analysis/Checkers/WebKit/ref-cntbl-base-virtual-dtor-templates.cpp
10

Do you ultimately want to have notes as well? Like, point the user to RefCntblBase or something? 'Cause we do have the ability to emit path-insensitive notes but they're only implemented in scan-build at this point.

martong added inline comments.Apr 1 2020, 7:38 AM
clang/lib/StaticAnalyzer/Checkers/WebKit/RefCntblBaseVirtualDtorChecker.cpp
22

This checker does not seem to benefit anything from path sensitivity. Would it be more appropriate as a clang-tidy (with ASTMatchers) checker?

dcoughlin added inline comments.
clang/lib/StaticAnalyzer/Checkers/WebKit/RefCntblBaseVirtualDtorChecker.cpp
22

We have many AST-based checks in the Clang Static Analyzer (some of which use ASTMatchers). I'd like to have this in the static analyzer so that static analyzer users can benefit from it.

NoQ added inline comments.Apr 7 2020, 2:05 PM
clang/lib/StaticAnalyzer/Checkers/WebKit/RefCntblBaseVirtualDtorChecker.cpp
22

Yeah, for now the answer to such questions is "whatever is the tool that the author uses", because the engine isn't the only thing that's different. Ideally when i finish my work on providing different output modes to clang-tidy and we also somehow allow transparently running clang-tidy on top of an existing static analyzer integration, we may move all of our AST-based checks to clang-tidy and finally solve this dilemma :)

I'll attend to your patch when I find the time!

In D77177#1953895, @NoQ wrote:

documentation

https://clang-analyzer.llvm.org/available_checks.html is where it should ultimately appear. In the monorepo it's in clang/www/analyzer in plain html.

I disagree, I think we should focus on the official documentation (https://clang.llvm.org/docs/analyzer/checkers.html), and try to keep the plain html coding to the minimum.

I think Documentation<HasDocumentation> has no formal meaning for now, just a markup for easier figuring out if we need one; @Szelethus am i right?

This was actually added by Aaron in D55792.

The static analysis checkers are mostly listed on two pages, one for alpha checks and one for released checks (there are also a considerable number of undocumented analyzer checkers). This patch adds anchors to all of the documented checks so that you can directly link to a check by a stable name. This is useful because the SARIF file format has a field for specifying a URI to documentation for a rule and some viewers, like CodeSonar, make use of this information.

This patch exposes those anchors through the SARIF exporter, which requires an automated way to determine the check URIs. Checkers.td now requires each checker to be annotated with an extensible list of documentation kinds (listed in CheckerBase.td) that map to physical URI paths on the static analyzer web page if the check has documentation (mapping is done by the table generator), which is exposed via the CHECKER macro. When exporting to SARIF, the checker name itself is used as the anchor target on the page (and I verified that checker names, including dots and dashes, will be valid anchors) and the corresponding edits were made to each HTML file to add the anchors.

The generated links still point to clang-analyzer.llvm.org instead of the official clang documentation, we should probably change that soon.

jkorous planned changes to this revision.Apr 8 2020, 1:38 PM

Sorry - I should have made this clear earlier so nobody waste their time reviewing current version - I'm rewriting this with ASTMatchers.

jkorous marked 10 inline comments as done.Apr 22 2020, 4:40 PM
jkorous added inline comments.
clang/include/clang/StaticAnalyzer/Checkers/Checkers.td
1511

I intended to keep all the checks initially as alpha but maybe it's unnecessary.

clang/lib/StaticAnalyzer/Checkers/WebKit/ASTUtils.h
27–28

I don't know what the best word would be here but the "origin" or "source" is an AST node which ultimately provides the memory address used in Expr E.

Some examples where we successfully find the origin:

void foo(RefCountable* ptr);

foo(nullptr); // origin of ptr is nullptr

RefCountable* local_ptr;
foo( *&local_ptr ); /// origin of ptr is local_ptr

void bar(RefPtr<RefCountable> refcntd_ptr_param) {
   foo( &makeRef(refcntd_ptr_param.get()).get() ); /// origin of ptr is refcntd_ptr_param
}

Some examples where we don't find the origin:

void foo(RefCountable* ptr);
RefCountable* provide_ref_cntbl();
RefCountable* somehow_combine_ptrs(RefCountable*, RefCountable*);

foo(provide_ref_cntbl()); // tryToFindPtrOrigin would return the CallExpr

foo( makeRefPtr(provide_ref_cntbl()).get() ); // tryToFindPtrOrigin would return the CallExpr to provide_ref_cntbl()

foo( makeRefPtr( somehow_combine_ptrs(provide_ref_cntbl(), provide_ref_cntbl()) ); // tryToFindPtrOrigin would return the CallExpr to somehow_combine_ptrs()

I think that ultimately the right place would be the documentation and I'd add a link here. Also, I should add tests and that might help with understanding too. WDYT?

clang/lib/StaticAnalyzer/Checkers/WebKit/RefCntblBaseVirtualDtorChecker.cpp
137

I'll fix that.
BTW why is the dot at the end stripped away? I was a bit surprised but then I found out that it's intentional :)

clang/test/Analysis/Checkers/WebKit/ref-cntbl-base-virtual-dtor-templates.cpp
10

I wasn't thinking about creating notes. I guess with good enough documentation that'd explain the concern we might be fine?

jkorous marked 2 inline comments as done.Apr 22 2020, 7:47 PM
jkorous added inline comments.
clang/lib/StaticAnalyzer/Checkers/WebKit/RefCntblBaseVirtualDtorChecker.cpp
23

Oh! You're right! I just copied this from some other checker.

With check::ASTDecl<CXXRecordDecl> I can nuke the whole ugly LocalVistitor!

But... Do you happen to know how can I visit template instantiations? In the ASTVisitor it's easy - I just override shouldVisitTemplateInstantiations().

Since I see this node in the AST in my tests:

ClassTemplateSpecializationDecl 0x7fa6010a3538 <line:25:1, line:26:32> col:8 struct DerivedClassTmpl3 definition

I tried:

class RefCntblBaseVirtualDtorChecker
    : public Checker<
      check::ASTDecl<CXXRecordDecl>,
      check::ASTDecl<ClassTemplateSpecializationDecl>
    > {
//...

  void checkClassTemplateSpecializationDecl(
// ...
  ) const {
    checkASTDecl(
      dyn_cast<ClassTemplateSpecializationDecl>(CTSD),
      MGR, BRArg);
  }

but seems like that's not it. I should probably think more about what's the relation between instantiations and specializations here.

I'd prefer not to copy the logic from RecursiveASTVisitor.h if at all possible.

jkorous requested review of this revision.Apr 22 2020, 7:52 PM

Not all comments have been addressed yet but we can continue the conversation.

jkorous marked an inline comment as done.Apr 23 2020, 4:25 PM
jkorous added inline comments.
clang/lib/StaticAnalyzer/Checkers/WebKit/RefCntblBaseVirtualDtorChecker.cpp
23

I think I kind of understand how this works now.

IIUC for check::ASTDecl-based checkers the visitor that invokes them is AnalysisConsumer. The thing that I don't see is a hypothetical switch that I could flip from within checker's implementation that would cause the AnalysisConsumer to visit template instantiations.

I tried hardcoding this in the AnalysisConsumer implementation by adding the following line to AnalysisConsumer.cpp and it worked like a charm.

bool shouldVisitTemplateInstantiations() const { return true; }

Now, I could imagine that for visitor which is the foundation for all AST-based checkers you might want to do this but 1. my context is infinitesimal 2. it's a positively scary thing to do now that checkers might rely on current behavior.

So, the approach with check::ASTDecl<TranslationUnitDecl> and defining my own visitor where I can locally tweak its behavior as I please can be seen as a workaround.

Since I actually copied the approach from PaddingChecker it seems that occasionally people run into this.

jkorous updated this revision to Diff 259994.Apr 24 2020, 2:10 PM
jkorous marked 4 inline comments as done.

Addressed the comments

NoQ added a comment.May 12 2020, 12:39 PM

Thanks for the investigations! My only remaining concern is about tryToFindPtrOrigin(). I suspect it's not used as part of this patch; maybe we can land it in a later patch in which i can see how it's used?

clang/lib/StaticAnalyzer/Checkers/WebKit/ASTUtils.h
27–28

I don't know what the best word would be here

Mm, can you describe it with multiple words? Because i'm completely confused at the moment. Like, i get the rough idea but i don't know how to apply it outside of the examples that you've already provided. For instance, both expressions ****p and a->b->c->d->e mention ~5 completely unrelated memory addresses; which one of those do you intend to grab? It would generally help a lot if you use words like "lvalue", "rvalue", "implicit lvalue-to-rvalue cast" (i believe that this cast kind most likely requires special treatment in your implementation).

clang/lib/StaticAnalyzer/Checkers/WebKit/RefCntblBaseVirtualDtorChecker.cpp
23

Aha, yup, ok. We have too few checkers for checkASTDecl<CXXRecordDecl> to figure out whether we want shouldVisitTemplateInstantiations() on or off by default. I think most checkers would ultimately want it on, but on the other hand it is challenging to provide a good diagnostic when you're describing a problem in an instantiation but all your source locations point to the original template. So it'd be an unobvious way to shoot yourself in the foot if we had it enabled by default and some checkers didn't take it into account. You printed out the type with template parameters and this seems more or less sufficient (though a note at the point of instantiation could also have helped). So let's keep it as is!

jkorous updated this revision to Diff 263915.May 13 2020, 9:35 PM

Addressed comments:

  • removed utils not necessary for this patch
  • added documentation
  • clang-format

I added documentation for the checker to clang/docs/analyzer/checkers.rst since clang/www/analyzer/available_checks.html seems to be out of date. Shall I still add it there too?

NoQ accepted this revision.May 21 2020, 8:08 AM

Thank you, let's land this!

This revision is now accepted and ready to land.May 21 2020, 8:08 AM
NoQ added a comment.EditedMay 21 2020, 8:14 AM

I added documentation for the checker to clang/docs/analyzer/checkers.rst since clang/www/analyzer/available_checks.html seems to be out of date. Shall I still add it there too?

Yeah, we'll have to clean it up eventually, don't worry for now.

clang/docs/analyzer/checkers.rst
1383 ↗(On Diff #263915)

It might be useful to precede the list with an explanation of what that list is of. Eg., "The checker reacts on the following WebKit classes and methods:" or "Hereinafter the following terminology is used for describing the available checkers:" or something along those lines.

This revision was automatically updated to reflect the committed changes.
Herald added a project: Restricted Project. · View Herald TranscriptMay 21 2020, 11:55 AM
Herald added a subscriber: cfe-commits. · View Herald Transcript
thakis added a subscriber: thakis.May 21 2020, 12:50 PM

This breaks the build everywhere (e.g. http://45.33.8.238/linux/18283/step_4.txt) and has been in for over an hour. Please watch bots after landing. (I think changes that are created more recently automatically get presubmit compile testing on phab.) Reverted in 1108f5c737dbdab0277874a7e5b237491839c43a for now.

About the patch itself: Project-specific checks like this usually go in clang-tidy, not in the static analyzer (which ships with the compiler binary).

NoQ added a comment.May 21 2020, 1:27 PM

Reverted in 1108f5c737dbdab0277874a7e5b237491839c43a for now.

Uh-oh. Thank you for reverting!

Project-specific checks like this usually go in clang-tidy, not in the static analyzer (which ships with the compiler binary).

There are a lot more considerations to be taken into account when choosing where to put the check. The static analyzer already contains a lot of project-specific and framework-specific checks which will never be implemented in clang-tidy because they're path-sensitive. Also these tools are not a drop-in replacement of each other: you cannot use clang-tidy in all the places where you can use the static analyzer (the opposite is partially true due to libStaticAnalyzer integration into clang-tidy but UI degrades dramatically when such integration is used) therefore many authors simply do not have a choice where to put the check (i'm slowly working on improving this situation so that most path-insensitive checks could indeed be migrated into clang-tidy but we're not there yet). So as of today we have no choice but to let our contributors decide freely where they want their checks to be.

clang/include/clang/StaticAnalyzer/Checkers/Checkers.td
1513

While we're at it, maybe remove "WebKit" from the checker name, given that it duplicates the package name?

This breaks the build everywhere (e.g. http://45.33.8.238/linux/18283/step_4.txt) and has been in for over an hour. Please watch bots after landing. (I think changes that are created more recently automatically get presubmit compile testing on phab.) Reverted in 1108f5c737dbdab0277874a7e5b237491839c43a for now.

Sorry about the mess! I shouldn't have landed the patch just before I stepped out for 2 hrs...