[frontend] [analyzer] Provide an option to load a checker from a declarative file
Needs ReviewPublic

Authored by george.karpenkov on Tue, Jul 10, 6:44 PM.

Details

Summary

Many non-compiler-engineers would like to write simple AST checkers.
The necessity to compile the compiler often becomes an impenetrable obstacle for them (further complicated by the fact that in a corporate setup often a *specific* version of the compiler is required to compile the given project).

This patch adds a an ability to load a checker from a YAML file with a very simple grammar, significantly lowering the entry bar for the authorship to AST-based checkers.

Short FAQ I anticipate:

Q: Why not just use clang-query?
A: This is a very similar approach, but meant to be friendly to the build system, so the new checker can be run on the entire codebase at once.

Q: Why not put it in clang-tidy?
A: Again, in order to be friendly to the build system. With this setup, it's sufficient to add an extra line to CFLAGS. Doing that with clang-tidy could be more difficult, as not all projects are based on a compilation database (and currently Apple does not even ship clang-tidy in the toolchain).

Q: How expensive is linking Clang with dynamic AST matchers?
A: We already link with static AST matchers, so this change should not matter much. Changes I've seen are below few percent.

Q: Why not give the ability to load multiple files which checker descriptions at once?
A: Currently it's not done due to a technical limitation in CheckerRegistry, to be addressed in subsequent patches.

Diff Detail

That's an awesome idea! Did you think about supporting automated fix-it suggestions?

clang/include/clang/StaticAnalyzer/Core/AnalyzerOptions.h
135

The name is quite verbose, but misses the important detail. DeclarativeCheckerFiles or DeclarativeCheckerFileNames, maybe?

clang/include/clang/StaticAnalyzer/Frontend/DeclarativeChecker.h
36

Why Matcher<Decl>? What if the user needs to match a Stmt?
Can we avoid using a class from the internal namespace of ast_matchers?

37

llvm::StringMap<std::string>?

clang/lib/StaticAnalyzer/Frontend/CheckerRegistration.cpp
130

const std::string &?

clang/lib/StaticAnalyzer/Frontend/DeclarativeChecker.cpp
33

nit: I'd remove the empty line at the top.

125–128

I'd put this branch first and remove the else.

126

Do you want to print the error code as well?

131

Maybe return llvm::ErrorOr<...> instead of llvm::Optional<...> and report the errors outside of this method?

153–154

Why can't we store a DynTypedMatcher without converting it to Matcher<Decl>? The former is a public interface and the latter is an internal one.

159–168

The returned function is executed (almost) right after this method returns. This pattern looks a bit strange. Can the method just call Registry.addChecker directly? Or can it return DeclarativeChecker (or llvm::ErrorOr<DeclarativeChecker>)?

clang/test/Analysis/declarative/failed_matcher.yaml
9–11

Is (((()))) actually a valid matcher for a variable declaration?

clang/include/clang/StaticAnalyzer/Core/AnalyzerOptions.h
135

Yep, should be fixed.

clang/include/clang/StaticAnalyzer/Frontend/DeclarativeChecker.h
36

Hm, so I've started with a DynTypedMatcher, but I think that makes code worse.
The current code force-wraps everything into a decl matcher, with forEachDescendant inside, as there's always a top-level TranslationUnitDecl.

37

Yep. Also I do think that performance nitpick in the driver code somewhat miss the point :P

clang/lib/StaticAnalyzer/Frontend/CheckerRegistration.cpp
130

yep

clang/lib/StaticAnalyzer/Frontend/DeclarativeChecker.cpp
33

ok

126

Do you know how to translate it to something useful? Where are those codes are even defined?

131

and then do what with an error code? Also outside this method we can't tell whether we've failed due to file not found / YAML parsing failed / AST matcher parsing failed.

159–168

Hm, I'm not sure. It could be changed to call Registry.addChecker, but that somewhat violates layering, as it's nice to see all calls to the registry in the registry handling .cpp file.

clang/test/Analysis/declarative/failed_matcher.yaml
9–11

The point of this test is to check that we get a reasonable error message from the parser.

That's an awesome idea!

Thanks!

Did you think about supporting automated fix-it suggestions?

Not really, I think of this as a "semantic-aware grep".

george.karpenkov marked 2 inline comments as done.Wed, Jul 11, 6:20 PM
george.karpenkov added inline comments.
clang/lib/StaticAnalyzer/Frontend/DeclarativeChecker.cpp
33

actually, no: I think an empty line above a doxygen comment improves readability.

153–154

OK so I've started with a DynTypedMatcher.
It has a couple of problems:

  • We have to use the annoying custom callback and can't just iterate over results
  • We can't store it, as adding it to the finder can fail, and it's too late to bail out with a diagnostics by then
  • We can store the finder, to which the matcher is already added. But then we have to also initialize it pointing to an existing container to put matches into, which is wrong at checker creation type.
NoQ added inline comments.Thu, Jul 12, 12:16 PM
clang/lib/StaticAnalyzer/Frontend/DeclarativeChecker.cpp
114–116

Hmm, maybe first parse the string, and then add decl(forEachDescendant(...)) around the matcher object?

178–179

Maybe warn the developer so that they don't spend hours figuring out why their matcher isn't matching anything?