Page MenuHomePhabricator

[clang-tidy] Support for Static Analyzer plugins
AbandonedPublic

Authored by xazax.hun on May 7 2015, 5:30 AM.

Details

Reviewers
None
Summary

Right now it is not possible to load and run static analyzer plugins when using clang tidy.

This patch makes it possible. However the list of possible checkers now depends upon the command line.

Note that, this patch also needs clang support.

The clang part is here: http://reviews.llvm.org/D9556

Diff Detail

Event Timeline

xazax.hun updated this revision to Diff 25168.May 7 2015, 5:30 AM
xazax.hun retitled this revision from to [clang-tidy] Support for Static Analyzer plugins.
xazax.hun updated this object.
xazax.hun edited the test plan for this revision. (Show Details)
xazax.hun added reviewers: alexfh, klimek, djasper.
xazax.hun added a subscriber: Unknown Object (MLST).
djasper removed a reviewer: djasper.May 7 2015, 5:32 AM
klimek edited edge metadata.May 7 2015, 5:32 AM

I like the idea. Unless anybody else has objections, I think this is good to go in.

xazax.hun updated this object.May 7 2015, 5:39 AM
xazax.hun edited edge metadata.
xazax.hun added inline comments.May 7 2015, 5:46 AM
clang-tidy/tool/ClangTidyMain.cpp
310

My biggest concern so far is that, the list of available checkers are determined dynamically once the compiler is instantiated. For this reason I had to move check listing and some error diagnostic after actually running the tool. This might not be a too big usability issue though, since it was not possible to list the available checkers without input files anyways. However it would be harder to improve on that after this patch is applied.

alexfh added inline comments.May 7 2015, 6:13 AM
clang-tidy/tool/ClangTidyMain.cpp
310

A clarification regarding the current state: a file name on the command line is currently needed to find a directory to read configuration from. The file doesn't have to exist as the analysis isn't run when we just want to list the checks.

As for the requirement to run the analysis in order to get the list of checks: I don't think it's a good idea. Do you just need this to parse the command line?

xazax.hun added inline comments.May 7 2015, 6:18 AM
clang-tidy/tool/ClangTidyMain.cpp
310

Yes, I think that I only need the plugin list from the command line. If there is a nice way to get that without actually running a clang tool that could work this issue around.

alexfh added inline comments.May 7 2015, 6:29 AM
clang-tidy/tool/ClangTidyMain.cpp
310

It should be possible to reuse command-line parsing on some level. I'd prefer a level where we don't need an actual input file.

xazax.hun added inline comments.May 7 2015, 6:40 AM
clang-tidy/tool/ClangTidyMain.cpp
310

My first attempt was to delegate most of the work to BeginInvocation of ASTFrontendAction, but it will not be triggered when the checked file does not exist. But I will try to parse command line before running the tool and submit a new patch.

xazax.hun updated this revision to Diff 25462.May 11 2015, 6:02 AM
  • New approach to get the list of static analyzer checkers dynamically:
  • Manually create a new compiler invocation object from the command line and get the plugin list from that invocation object.

Clang part:
tooling::newDriver and tooling::getCC1Arguments was not the part of the public API. To make this patch work, they need to be added to the public API. If you find this approach a good enough I will update the clang part of this patch.

alexfh edited edge metadata.May 11 2015, 7:44 AM

Does this only work with the fixed compilation database (when the compiler options are specified after a '--' on the command line of clang-tidy)? I find this problematic for at least two reasons: Maybe we need to add first-class clang-tidy options to load static analyzer plugins?

Oops, hit "Submit" early.

So, I find this approach problematic for a couple of reasons:

  1. lack of support for other compilation databases makes the feature hardly usable for many use cases (fixed compilation database is the most basic case which doesn't work well for real projects).
  2. getting the list of plugins from the file's compiler arguments seems like a wrong approach. The list of available checks becomes dependent on the build configuration of the project.

I think, a better approach is to add a clang-tidy option for loading a static analyzer plugin and feed it to the static analyzer.

xazax.hun updated this revision to Diff 26319.May 22 2015, 5:57 AM
xazax.hun edited edge metadata.

Clang static analyzer plugin is now passed as a clang tidy argument.
If you find this approach reasonable I will update the clang part of this patch as well.

Small bump, just in case this got forgotten.

alexfh added a comment.Jun 3 2015, 7:07 AM

Small bump, just in case this got forgotten.

Yes, I actually forgot to reply. Feel free to ping me each week or so, in case I don't reply.

Now to the patch itself. I have a number of concerns, please see the comments inline.

clang-tidy/ClangTidy.cpp
68

It was fine to have this as a global variable while it was a compile-time constant. But it's a bad idea to use a global mutable variable to pass configuration to the library. Actually, I'd leave the compile-time checks list as a fallback, and only override it if clang-tidy is instructed to use a static analyzer plugin.

416

Is the only useful effect of this function to fill the StaticAnalyzerChecks variable? It's bad on its own as any other interface which only useful effect is side-effect.

AFAIU, you still rely on the compilation arguments to instruct the static analyzer that it needs to load a plugin? That doesn't sound like a good idea either for multiple reasons, most importantly that:

  1. This requires modification of the compilation arguments in the build system in order to run an analysis implemented in a plugin. Looks like an overkill and a totally wrong way of doing this.
  2. Compile arguments will affect what is loaded and executed by clang-tidy, which also seems wrong. If it is already the case, it was definitely unintended and we need to fix this instead of relying on this.
  3. 1 and 2 also mean that for correct usage clang-tidy's -plugin-path option will always need to be in sync with the -load options in the compilation database (read: in the build system). Extremely brittle and confusing.
428

BTW, does this option rely on the plugin support turned on at build time? If yes, we need to make most of this code #ifdefed out using the same preprocessor macros.

clang-tidy/tool/ClangTidyMain.cpp
149

Does static analyzer support only one plugin at a time?

Also, it _may_ be better to put this into ClangTidyOptions, but I'm not sure yet. Upsides are that then we'd avoid modification of most other interfaces. Obvious downsides are that this option only makes sense for the command-line front-end.

Can you explain your use case in more detail?

270

nit: Remove the empty line.

clang-tidy/tool/Makefile
20

nit: This should be above the empty line.

xazax.hun added inline comments.Jun 4 2015, 1:08 AM
clang-tidy/ClangTidy.cpp
68

It is a good idea to fall back to the checker list generated from header files.

What do you think, where should I put the mutable vector? In ClangTidyOptions?

416

Yes, the only effect of this function is to initialize the checker list. What else would you prefer? Once the mutable list is the member of ClangTidyOptions, I could make this function a method.

With plugin-path argument, the command line of compilation do not need to be modified from the user's point of view. I only forgot to update the test case. Once the plugin path is given in the command line, this method should just work with any kind of unmodified compilation database.

The reason why it is not needed to add the -load argument to the command line:

This function loads the plugin which is kept in the memory for the whole runtime. Upon plugin load, the checkers are registered. So even though the load argument is not given to the frontend action that does the checking, the plugin already in the memory and the checkers already are registered, so they will be used.

If you find this method hacky or not future proof, I could create an argument adjuster object here which could be used during the next steps. However StaticAnalyzerChecks have to be initialized early enough so that the user can list the checkers that are loaded from the plugin.

428

Good point, I think yes, this plugin support requires that it is turned on compile time.

clang-tidy/tool/ClangTidyMain.cpp
149

I never tried to use multiple plugins, however you are right, the static analyzer supports to load multiple ones at the same time. I see your point to put this into ClangTidyOptions, so one can configure it using yaml files.

My usecase is the following:
We have a script, that uses LD_PRELOAD to load a shared object that hijacks the exec function call family and filters and logs compiler calls. After this log file is created a script invokes clang tidy with slightly modified compilation arguments. This way we can support any build system on posix systems including incremental build support.We just got the permission to open source this tool set, so it is possible that we will try to upstream this tool soon.

xazax.hun updated this revision to Diff 27096.Jun 4 2015, 1:11 AM
  • The global static analyzer checker list now initialized with the static list. It is only updated if plugin-path command line argument is given.
  • Fixed some nits.
alexfh added inline comments.Jun 5 2015, 8:00 AM
clang-tidy/ClangTidy.cpp
68

See the comment below.

Also, please make this variable an array of StringRefs again.

416

Yes, the only effect of this function is to initialize the checker list.

Well, this contradicts to the explanation of how plugins are loaded that you give below ;) But now I get it, and looks like the function should be called "loadPlugins" or similar in order to make the important side-effect obvious from the function name. And as we now rely on a mutable global state anyway, it may be fine to introduce (#ifdefed only when plugins are enabled) an additional global vector to store dynamically constructed static analyzer check list. At least I don't see a substantially better alternative.

Filtering out -load arguments from the arguments we get from a compilation database looks like a useful thing to do unless there's a better way to disable loading of plugins during the analysis.

clang-tidy/tool/ClangTidyMain.cpp
149

I'm still not sure about moving the "-plugin-path" option to ClangTidyOptions.

My usecase is the following:
We have a script, that uses LD_PRELOAD to load a shared object that
hijacks the exec function call family and filters and logs compiler calls.

Similar to this one? https://github.com/rizsotto/Bear

After this log file is created a script invokes clang tidy with slightly
modified compilation arguments. This way we can support any build
system on posix systems including incremental build support.

You mean you run analyses on each build? Sounds interesting, though there's a (possibly more effective) alternative: to run analyses as a part of submitting the code for review.

But anyway, that doesn't answer why do you need plugins and how you use them.

xazax.hun added inline comments.Jun 8 2015, 12:54 AM
clang-tidy/tool/ClangTidyMain.cpp
149

Oh, our solution is very similar to Bear indeed thanks for pointing this out.
We do not run the analysis on each build. We have a script, that can run the analysis as the part of a build, and the user can decide when to run it ( or it can be automated by commit hooks etc). So one can only recheck the translation units that he modified.

About the plugins, we use them for two reasons:

  1. It is faster to link a plugin shared object than linking clang, so we can iterate faster with changes.
  2. We find it cleaner that most of the code we write is out of the clang source tree. This way we don't end up using an internal clang fork, and our repository is smaller.
dkrupp added a subscriber: dkrupp.Jun 8 2015, 1:54 AM
babati added a subscriber: babati.Jun 8 2015, 2:03 AM
alexfh added a comment.Jun 8 2015, 7:33 AM

After thinking a bit more, I prefer to leave -plugin-path command-line option at least for now, because it doesn't make sense for other clang-tidy frontends.

clang-tidy/tool/ClangTidyMain.cpp
149

... So one can only recheck the translation units that he modified.

Unrelated to this patch: did you think about using clang-tidy-diff.py or a similar VCS-based approach to define what "modified files" are?

About the plugins, we use them for two reasons:

  1. It is faster to link a plugin shared object than linking clang, so we can iterate faster with changes.
  2. We find it cleaner that most of the code we write is out of the clang source tree. This way we don't end up using an internal clang fork, and our repository is smaller.

Thanks for explanation. Hopefully, clang-tidy with its compile-time extensibility improves #2 by allowing to keep checks in a separate tree more conveniently. Is #1 a big concern for you?

xazax.hun added inline comments.Jun 10 2015, 12:20 AM
clang-tidy/tool/ClangTidyMain.cpp
149

Unrelated to this patch: did you think about using clang-tidy-diff.py or a similar VCS-based approach to define what "modified files" are?

For fast iterative check (when all flow sensitive checkers are disabled), VCS based solution is a very good choice.

However in our opinion, right before commit, it is not good enough to run checks only on changes files. For example after changing a header file the tool might find new issues in any translation unit that includes that header file.
Another problem is with generated files. At Ericsson there are several interface definition languages that defines the interfaces of services. C/C++ code is generated from those IDL files during build. In case the user runs a build before running the VCS based solution it should work well, but to be less error prone we used another approach. Using the build system to invoke clang-tidy (using something like bear) ensures that these generated files are up to date.

Thanks for explanation. Hopefully, clang-tidy with its compile-time extensibility improves #2 by allowing to keep checks in a separate tree more conveniently. Is #1 a big concern for you?

Unfortunately some of the products is still developed in very out to date environments. We had to support those outdated OS-es, and the way we do it, we build our packages and run the tests for those projects in a virtualized environment. The performance penalty from this makes #1 a big concern for us right now.

alexfh added inline comments.Jun 10 2015, 8:26 AM
clang-tidy/tool/ClangTidyMain.cpp
149

Fair enough. Thanks for the explanation!

klimek resigned from this revision.Jul 3 2015, 6:40 AM
klimek removed a reviewer: klimek.
alexfh removed a reviewer: alexfh.Nov 5 2015, 2:31 PM
xazax.hun abandoned this revision.Nov 5 2015, 2:32 PM