This is an archive of the discontinued LLVM Phabricator instance.

Differential D117939

[clang-tidy] Add more documentation about check development
ClosedPublic

Authored by LegalizeAdulthood on Jan 21 2022, 5:55 PM.

Details

Reviewers

alexfh
aaron.ballman
ymandel
njames93

Commits

rG8ce99dadb007: [clang-tidy] Add more documentation about check development (NFC)

Summary

Mention pp-trace
CMake configuration
Overriding registerPPCallbacks
Check development tips
- Guide to useful documentation
- Using the Transformer library
- Developing your check incrementally
- Creating private matchers
- Unit testing helper code
- Making your check robust
- Documenting your check
Describe the Inputs test folder

Diff Detail

Repository: rG LLVM Github Monorepo

Event Timeline

LegalizeAdulthood created this revision.Jan 21 2022, 5:55 PM

Herald added a subscriber: xazax.hun. Jan 21 2022, 5:55 PM

LegalizeAdulthood requested review of this revision.Jan 21 2022, 5:55 PM

LegalizeAdulthood edited the summary of this revision. (Show Details)

Spelling

Harbormaster completed remote builds in B144964: Diff 402153.Jan 21 2022, 6:38 PM

It would also make sense to mention isLanguageVersionSupported.

clang-tools-extra/docs/clang-tidy/Contributing.rst
79	Clang.
233	Excessive newline.
238	Clang.
244	Clang.
266	Clang.
340	API.
386	Link to Release Notes?

In D117939#3263180, @Eugene.Zelenko wrote:

It would also make sense to mention isLanguageVersionSupported.

Good idea.

Update from review comments

Harbormaster completed remote builds in B145030: Diff 402238.Jan 22 2022, 11:09 AM

LegalizeAdulthood added inline comments.Jan 22 2022, 2:22 PM

clang-tools-extra/docs/clang-tidy/Contributing.rst
296	Should be `functions`
339	chunks plural, "an intention revealing name" singular mismatch

Eugene.Zelenko added inline comments.Jan 22 2022, 5:58 PM

clang-tools-extra/docs/clang-tidy/Contributing.rst
233	Other languages and their versions are applicable too.

Update from review comments

Clarify ninja build example

Eugene.Zelenko added inline comments.Jan 22 2022, 9:05 PM

clang-tools-extra/docs/clang-tidy/Contributing.rst
233	Objective-C is also supported.

Harbormaster completed remote builds in B145066: Diff 402284.Jan 22 2022, 9:14 PM

salman-javed-nz added a subscriber: salman-javed-nz.Jan 23 2022, 5:52 AM

salman-javed-nz added inline comments.

clang-tools-extra/docs/clang-tidy/Contributing.rst
76	Don't you need to enable both clang and clang-tools-extra? Otherwise the clang-tidy CMake target doesn't appear. That has been my experience.

LegalizeAdulthood marked 2 inline comments as done.Jan 23 2022, 7:01 AM

LegalizeAdulthood added inline comments.

clang-tools-extra/docs/clang-tidy/Contributing.rst
233	I copied the doxygen language for `LangOptions` from `LangOptions.h`.

Update from review comments

Harbormaster completed remote builds in B145096: Diff 402317.Jan 23 2022, 7:16 AM

Thank you so much for working on this documentation, I really like the direction it's going!

clang-tools-extra/docs/clang-tidy/Contributing.rst
78
228
229	As usual, we're not super consistent, but most of our documentation is single-spaced (can you correct this throughout your changes?).
233	I made it more generic, you can use this for more than just checking languages (you can check for other language features like whether `char` is signed or unsigned, etc).
262	I'd argue the most important thing you interact with from Clang are the AST nodes. Maybe instead of "most important", we can be vague and just say "Some commonly used features are" or something like that?
301–317	I think this documentation is really good, but at the same time, I don't think we have any clang-tidy checks that make use of the transformer library currently. I don't see a reason we shouldn't document this more prominently, but I'd like to hear from @ymandel and/or @alexfh whether they think the library is ready for officially supported use within clang-tidy and whether we need any sort of broader community discussion. (I don't see any issues here, just crossing t's and dotting i's in terms of process.)
339–340
343
360–361	Do we have any private matchers that are unit tested? My understanding was that the public matchers were all unit tested, but that the matchers which are local to a check are not exposed via a header file, and so they're not really unit testable.
370	Reworded to avoid a loaded term and not make it sound C++ specific; needs re-flowing to 80-col limits
372–373
383	Another one that catches people out is testing on Windows vs non-Windows targets (as targets which are compatible with MSVC default to a different template instantiation behavior). One key thing that's not discussed explicitly are false positive rates. Clang-tidy can have significantly higher false positive rates than diagnostics in Clang or the static analyzer, but we still care about not being overly chatty. But "overly chatty" depends on the check -- if the check is for a coding standard and the coding standard says "no calls to 'foo()'", then it's not chatty to diagnose every call to "foo()". But if the check is not following a coding standard, but is instead trying to help someone modernize their code, find bugs, or make it more readable (as examples), then perhaps diagnosing every call to "foo()" will be too chatty. So we ask people to test their code against real world code bases to try to gauge what the false positive and true positive rates are for the check, just to make sure they seem reasonable. Another thing we may want to mention is that checks can be configured. This helps not only with exposing different functionality for the check, but also can help to control perceived false positives for some use cases.

@aaron.ballman I think this has exposed some limitations with the add-new-check script. Maybe there is merit for the script to be updated to support preprocessor callbacks and options, WDYT?

LegalizeAdulthood added inline comments.Jan 24 2022, 7:09 AM

clang-tools-extra/docs/clang-tidy/Contributing.rst
229	People keep asking for this, but it doesn't matter for the final output and it isn't specified anywhere in the style guide.
233	This came up in another review, but if you have a check that applies to C++11 or later, do you have to check all the versions or can you assume that the C++11 flag will be set when C++14 is requested via command-line options?
301–317	There are at least two checks that use the Transformer library currently.
360–361	I just did this for my switch/case/label update to simplify boolean expressions. You do have to expose them via a header file, which isn't a big deal.

In D117939#3266228, @njames93 wrote:

@aaron.ballman I think this has exposed some limitations with the add-new-check script. Maybe there is merit for the script to be updated to support preprocessor callbacks and options, WDYT?

It could certainly use an option to specify that your check is Transformer
based.

In D117939#3266228, @njames93 wrote:

@aaron.ballman I think this has exposed some limitations with the add-new-check script. Maybe there is merit for the script to be updated to support preprocessor callbacks and options, WDYT?

I wouldn't be opposed to it. I think it currently serves the majority of the needs (there are far fewer preprocessor checks than there are AST checks), but making it more full-featured would be a win for some folks.

clang-tools-extra/docs/clang-tidy/Contributing.rst
229	We're consistently inconsistent, but the reason why I think we tend to prefer single space is because it's fewer bytes for everyone to download (as you say, single vs double space doesn't matter for the generated output) and generally we're all reading this on computer screens rather than in print (where double spacing actually helped readability).
233	Later modes will also set earlier modes. e.g., passing `-std=c++20` on the command line will set `CPlusPlus`, `CPlusPlus11`, `CPlusPlus14`, `CPlusPlus17`, and `CPlusPlus20` in `LangOptions`.
301–317	The only mentions of `TransformerClangTidyCheck.h` that I can find are in `ClangTransformerTutorial.rst` and `clang-formatted-files.txt`, so if there are other checks using this functionality, they're not following what's documented here.
360–361	You do have to expose them via a header file, which isn't a big deal. Then they become part of the public interface for the check and anything which includes the header file has to do heavy template instantiation work that contributes more symbols to the built library. In general, we don't expect private matchers to be unit tested -- we expect the tests for the check to exercise the matcher appropriately.

LegalizeAdulthood marked 16 inline comments as done.Jan 24 2022, 7:14 PM

LegalizeAdulthood added inline comments.

clang-tools-extra/docs/clang-tidy/Contributing.rst
301–317	`CleanupCtadCheck`, `StringFindStrContainsCheck`, and `StringviewNullptrCheck` all derive from `TransformerClangTidyCheck`. The first two are in the `abseil` module and the last is in the `bugprone` module.
360–361	Look at my review to see how I handled it; the matchers are in a separate header file, in my case `SimplifyBooleanExprMatchers.h` and aren't exposed to consumers of the check. The matchers header is only included by the implementation and the matcher tests.

Update from review comments

Aside from the unit testing bit, I think this is fantastic! (And the unit testing bit may also be fantastic, I just think it needs more explicit discussion with a wider audience.)

clang-tools-extra/docs/clang-tidy/Contributing.rst
301–317	Oh, interesting, thanks for pointing that out! It turns out that `clang` and `clang-tools-extra` are different sibling directories and searching `clang` for things you expect to find in `clang-tools-extra` is not very helpful. :-D My concerns are no longer a concern here.
360–361	Thanks for pointing out how you're doing it in one of your checks -- I still don't think we should document that we expect people to unit test private matchers unless clang-tidy reviewers are on board with the idea in general. IMO, that's putting a burden on check authors and all the reviewers to do something that's never been suggested before (let alone documented as a best practice) -- that's worthy of open discussion instead of adding it to a patch intended to document current practices.

Overall, this looks fantastic! You may want to consider (in a separate patch) mentioning godbolt.org, which is a great UI for interacting with clang-query and the AST. Example configuration: https://godbolt.org/z/v88W8ET19

In D117939#3266234, @LegalizeAdulthood wrote:

In D117939#3266228, @njames93 wrote:

@aaron.ballman I think this has exposed some limitations with the add-new-check script. Maybe there is merit for the script to be updated to support preprocessor callbacks and options, WDYT?

It could certainly use an option to specify that your check is Transformer
based.

Agreed. Happy to do this if there's interest. I already have a (google) internal version of this script that does exactly that. In fact, we set it as the default, and require users to explicitly opt out. It is our preference for all new clang tidy checks.

clang-tools-extra/docs/clang-tidy/Contributing.rst
233–234	nit: "be sure"? (just to vary the language a bit)
301–317	My 2cents: definitely comfortable encouraging adoption. In fact, we require it on internal checks unless the user has a strong reason to opt out.
360–361	Agreed on this point -- that we should push this off to discussion on a separate patch. I'm definitely fine with pointing out that unit testing is possible, since that may not be obvious. But, I'd have to be convinced that we want to insist on it.
389–390	Is it worth giving a rule-of-thumb? Personally I'd go with < 10%, all things being equal. A check for a serious bug may reasonably have a higher false positive rate, and trivial checks might not justify any false positives. But, if neither of these apply, I'd recommend 10% as the default.

LegalizeAdulthood marked an inline comment as done.Jan 25 2022, 2:00 PM

LegalizeAdulthood added inline comments.

clang-tools-extra/docs/clang-tidy/Contributing.rst
362	If I change the wording from "is best tested with a unit test" to "can be tested with a unit test", would that alleviate the concern? I want to encourage appropriate testing and unit testing complex helper code is better than integration testing helper code. I find it easier to have confidence in private matchers if they are unit tested and I've recently had a few situations where I had to write relatively complex helper functions to analyze raw text that I felt would have been better tested with a unit test than an integration test.
389–390	I'm OK with rule-of-thumb 10% advice.
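
As a rough illustration of the kind of helper code discussed above, here is a minimal sketch of a raw-text helper that is easy to test in isolation. `looksLikeMacroName` and `testLooksLikeMacroName` are hypothetical names invented for this example and do not come from any existing check. Plain `assert` stands in for the GoogleTest macros that LLVM's unit tests normally use.

```cpp
#include <cassert>
#include <cctype>
#include <string>

// Hypothetical helper (not from any real check): returns true if Name looks
// like a conventional macro name, i.e. it is non-empty, does not start with a
// digit, and contains only 'A'-'Z', '0'-'9' and '_'.
bool looksLikeMacroName(const std::string &Name) {
  if (Name.empty() || std::isdigit(static_cast<unsigned char>(Name.front())))
    return false;
  for (unsigned char C : Name)
    if (!(std::isupper(C) || std::isdigit(C) || C == '_'))
      return false;
  return true;
}

// A unit test can exercise edge cases directly, without running clang-tidy
// over a source file.
void testLooksLikeMacroName() {
  assert(looksLikeMacroName("MAX_SIZE"));
  assert(looksLikeMacroName("_X1"));
  assert(!looksLikeMacroName(""));
  assert(!looksLikeMacroName("1ST"));
  assert(!looksLikeMacroName("maxSize"));
}
```

Edge cases such as the empty string or a leading digit are awkward to reach through a full check run, which is the argument made above for testing this kind of helper directly.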

Update from review comments

aaron.ballman added inline comments.Jan 26 2022, 7:01 AM

clang-tools-extra/docs/clang-tidy/Contributing.rst

362

If I change the wording from "is best tested with a unit test" to "can be tested with a unit test", would that alleviate the concern?

It largely addresses mine -- saying it can be done is great, saying it should be done is what gave me pause.

389–390

FWIW, I think 10% is pretty arbitrary and I'd rather not see us try to nail it down to a concrete number. In practical terms, it really depends on the check.

Also, clang-tidy is where we put things with a "high" false positive rate already, so this statement has implications on what an acceptable false positive rate is for Clang or the static analyzer.

How about something along these lines:

- Watch out for high false positive rates. Ideally, a check would have no false positives, but given that matching against an AST is not control- or data flow-sensitive, a number of false positives are expected. The higher the false positive rate, the less likely the check will be adopted in practice. Mechanisms should be put in place to help the user manage false positives.
- There are two primary mechanisms for managing false positives: supporting a code pattern which allows the programmer to silence the diagnostic in an ad hoc manner and check configuration options to control the behavior of the check.
- Consider supporting a code pattern to allow the programmer to silence the diagnostic whenever such a code pattern can clearly express the programmer's intent. For example, allowing an explicit cast to `void` to silence an unused variable diagnostic.
- Consider adding check configuration options to allow the user to opt into more aggressive checking behavior without burdening users for the common high-confidence cases.

(or something along those lines). The basic idea I have there is: false positives are expected, try to keep them to a minimum, and here are the two most common ways code reviewers will ask you to handle false positives when they're a concern.
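
To make the "explicit cast to `void`" pattern mentioned above concrete, here is a minimal sketch. `parseDigit` is a hypothetical function invented for this example. Whether a particular check honors the cast depends on how that check is implemented.

```cpp
#include <cassert>

// Hypothetical function used only for illustration.
int parseDigit(char C) {
  const bool IsDigit = C >= '0' && C <= '9';
  assert(IsDigit && "caller must pass a decimal digit");
  // When NDEBUG is defined the assert above expands to nothing, so IsDigit
  // would otherwise be unused. The explicit cast to void states that this is
  // intentional, which lets a diagnostic for unused variables stay quiet.
  (void)IsDigit;
  return C - '0';
}
```

The cast expresses the programmer's intent at the exact location of the would-be diagnostic, so no configuration change is needed to silence it.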

ymandel added inline comments.Jan 26 2022, 10:53 AM

clang-tools-extra/docs/clang-tidy/Contributing.rst
362	+1
389–390	Strongly agree. 10% has served us well in practice for the threshold at which we disable/fix checks, but it's definitely arbitrary. I much prefer your suggested approach.

salman-javed-nz removed a subscriber: salman-javed-nz.Jan 26 2022, 11:30 AM

LegalizeAdulthood marked 2 inline comments as done.Jan 26 2022, 1:05 PM

LegalizeAdulthood added inline comments.

clang-tools-extra/docs/clang-tidy/Contributing.rst
389–390	Yeah, I wasn't a fan of a magic number style piece of advice. I like the reworded suggestion better.

Update from review comments

Harbormaster completed remote builds in B145838: Diff 403386.Jan 27 2022, 5:36 AM

ymandel accepted this revision.Jan 27 2022, 5:49 AM

This revision is now accepted and ready to land.Jan 27 2022, 5:49 AM

LGTM, thank you for these fantastic improvements to the docs!

Closed by commit rG8ce99dadb007: [clang-tidy] Add more documentation about check development (NFC) (authored by LegalizeAdulthood). Jan 27 2022, 8:44 AM

This revision was automatically updated to reflect the committed changes.

LegalizeAdulthood added a commit: rG8ce99dadb007: [clang-tidy] Add more documentation about check development (NFC).

Revision Contents

Path: clang-tools-extra/docs/clang-tidy/Contributing.rst
Size: 225 lines

Diff 403677

clang-tools-extra/docs/clang-tidy/Contributing.rst

Show All 16 Lines

and precise checks in just a few lines of code. If you have an idea for a good
check, the rest of this document explains how to do this.

There are a few tools particularly useful when developing clang-tidy checks:

* ``add_new_check.py`` is a script to automate the process of adding a new
  check, it will create the check, update the CMake file and create a test;
* ``rename_check.py`` does what the script name suggests, renames an existing
  check;
* :program:`pp-trace` logs method calls on `PPCallbacks` for a source file
  and is invaluable in understanding the preprocessor mechanism;
* :program:`clang-query` is invaluable for interactive prototyping of AST
  matchers and exploration of the Clang AST;
* `clang-check`_ with the ``-ast-dump`` (and optionally ``-ast-dump-filter``)
  provides a convenient way to dump AST of a C++ program.

If CMake is configured with ``CLANG_TIDY_ENABLE_STATIC_ANALYZER=NO``,
:program:`clang-tidy` will not be built with support for the
``clang-analyzer-*`` checks or the ``mpi-*`` checks.

Show All 32 Lines

Once you are done, change to the ``llvm/clang-tools-extra`` directory, and
let's start!

.. _Getting Started with the LLVM System: https://llvm.org/docs/GettingStarted.html
.. _Using Clang Tools: https://clang.llvm.org/docs/ClangTools.html
.. _How To Setup Clang Tooling For LLVM: https://clang.llvm.org/docs/HowToSetupToolingForLLVM.html

When you `configure the CMake build <https://llvm.org/docs/GettingStarted.html#local-llvm-configuration>`_,

make sure that you enable the ``clang`` and ``clang-tools-extra`` projects to

build :program:`clang-tidy`.

Because your new check will have associated documentation, you will also want to install

aaron.ballmanUnsubmitted

Done

build :program:`clang-tidy`.

- Since your new check will have associated documentation, you will also want to install

+ Because your new check will have associated documentation, you will also want to install

`Sphinx <https://www.sphinx-doc.org/en/master/>`_ and enable it in the CMake configuration.

aaron.ballman:

`Sphinx <https://www.sphinx-doc.org/en/master/>`_ and enable it in the CMake configuration.

To save build time of the core Clang libraries you may want to only enable the ``X86``

target in the CMake configuration.

The Directory Structure
-----------------------

:program:`clang-tidy` source code resides in the
``llvm/clang-tools-extra`` directory and is structured as follows:

::

▲ Show 20 Lines • Show All 129 Lines • ▼ Show 20 Lines .. code-block:: c++

  }

(If you want to see an example of a useful check, look at
`clang-tidy/google/ExplicitConstructorCheck.h
<https://github.com/llvm/llvm-project/blob/main/clang-tools-extra/clang-tidy/google/ExplicitConstructorCheck.h>`_
and `clang-tidy/google/ExplicitConstructorCheck.cpp
<https://reviews.llvm.org/diffusion/L/browse/clang-tools-extra/trunk/clang-tidy/google/ExplicitConstructorCheck.cpp>`_).

If you need to interact with macros or preprocessor directives, you will want to

aaron.ballmanUnsubmitted

Done

<https://reviews.llvm.org/diffusion/L/browse/clang-tools-extra/trunk/clang-tidy/google/ExplicitConstructorCheck.cpp>`_).

- If you need to interact macros and preprocessor directives, you will want to

+ If you need to interact with macros or preprocessor directives, you will want to

override the method ``registerPPCallbacks``. The ``add_new_check.py`` script

aaron.ballman:

override the method ``registerPPCallbacks``. The ``add_new_check.py`` script

aaron.ballmanUnsubmitted

Done

If you need to interact macros and preprocessor directives, you will want to

- override the method ``registerPPCallbacks``. The ``add_new_check.py`` script

+ override the method ``registerPPCallbacks``. The ``add_new_check.py`` script

does not generate an override for this method in the starting point for your

As usual, we're not super consistent, but most of our documentation is single-spaced (can you correct this throughout your changes?).

aaron.ballman: As usual, we're not super consistent, but most of our documentation is single-spaced (can you…

LegalizeAdulthoodAuthorUnsubmitted

Done

People keep asking for this, but it doesn't matter for the final output and it isn't
specified anywhere in the style guide.

LegalizeAdulthood: People keep asking for this, but it doesn't matter for the final output and it isn't specified…

aaron.ballmanUnsubmitted

Done

We're consistently inconsistent, but the reason why I think we tend to prefer single space is because it's fewer bytes for everyone to download (as you say, single vs double space doesn't matter for the generated output) and generally we're all reading this on computer screens rather than in print (where double spacing actually helped readability).

aaron.ballman: We're consistently inconsistent, but the reason why I think we tend to prefer single space is…

does not generate an override for this method in the starting point for your

new check.

If your check applies only under a specific set of language options, be sure

aaron.ballmanUnsubmitted

Done

new check.

- If your check applies only to specific dialect of C or C++, you will

+ If your check applies only under a specific set of language options, you will

want to override the method ``isLanguageVersionSupported`` to reflect that.

I made it more generic, you can use this for more than just checking languages (you can check for other language features like whether char is signed or unsigned, etc).

aaron.ballman: I made it more generic, you can use this for more than just checking languages (you can check…

to override the method ``isLanguageVersionSupported`` to reflect that.
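
A minimal sketch of such an override, under stated assumptions: `LangOptionsSketch`, `Standard`, `makeLangOptions` and `MyCheckSketch` are stand-ins invented here so that the example compiles without the Clang headers. In a real check the method takes a `const LangOptions &` and is marked `override`. The sketch relies on the behavior described in the review, where a later mode such as `-std=c++20` also sets the flags of the earlier modes.

```cpp
#include <cassert>

// Stand-in for clang::LangOptions (assumption: only a few flags are modeled;
// the real class is declared in clang/Basic/LangOptions.h).
struct LangOptionsSketch {
  bool CPlusPlus = false;
  bool CPlusPlus11 = false;
  bool CPlusPlus14 = false;
  bool CPlusPlus17 = false;
  bool CPlusPlus20 = false;
};

// Hypothetical enumeration of language modes, ordered from oldest to newest.
enum class Standard { C, Cxx98, Cxx11, Cxx14, Cxx17, Cxx20 };

// Hypothetical helper mimicking how a later mode also sets earlier modes.
LangOptionsSketch makeLangOptions(Standard Std) {
  LangOptionsSketch Opts;
  Opts.CPlusPlus = Std >= Standard::Cxx98;
  Opts.CPlusPlus11 = Std >= Standard::Cxx11;
  Opts.CPlusPlus14 = Std >= Standard::Cxx14;
  Opts.CPlusPlus17 = Std >= Standard::Cxx17;
  Opts.CPlusPlus20 = Std >= Standard::Cxx20;
  return Opts;
}

// Hypothetical check that only applies to C++11 or later. Testing the single
// CPlusPlus11 flag is enough because newer modes set it as well.
struct MyCheckSketch {
  bool isLanguageVersionSupported(const LangOptionsSketch &LangOpts) const {
    return LangOpts.CPlusPlus11;
  }
};

void demoLanguageGate() {
  const MyCheckSketch Check{};
  assert(Check.isLanguageVersionSupported(makeLangOptions(Standard::Cxx20)));
  assert(!Check.isLanguageVersionSupported(makeLangOptions(Standard::Cxx98)));
  assert(!Check.isLanguageVersionSupported(makeLangOptions(Standard::C)));
}
```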

Check development tips

----------------------

Writing your first check can be a daunting task, particularly if you are unfamiliar

with the LLVM and Clang code bases. Here are some suggestions for orienting yourself

in the codebase and working on your check incrementally.

Guide to useful documentation

^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Many of the support classes created for LLVM are used by Clang, such as `StringRef

<https://llvm.org/docs/ProgrammersManual.html#the-stringref-class>`_

and `SmallVector <https://llvm.org/docs/ProgrammersManual.html#llvm-adt-smallvector-h>`_.

These and other commonly used classes are described in the `Important and useful LLVM APIs

<https://llvm.org/docs/ProgrammersManual.html#important-and-useful-llvm-apis>`_ and

`Picking the Right Data Structure for a Task

<https://llvm.org/docs/ProgrammersManual.html#picking-the-right-data-structure-for-a-task>`_

sections of the `LLVM Programmer's Manual

<https://llvm.org/docs/ProgrammersManual.html>`_. You don't need to memorize all the

details of these classes; the generated `doxygen documentation <https://llvm.org/doxygen/>`_

has everything if you need it. In the header `llvm/ADT/STLExtras.h

<https://llvm.org/doxygen/STLExtras_8h.html>`_ you'll find useful versions of the STL

algorithms that operate on LLVM containers, such as `llvm::all_of

<https://llvm.org/doxygen/STLExtras_8h.html#func-members>`_.
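
To illustrate what "operate on containers" means here, the following sketch defines a range-based wrapper in the same spirit. `sketch::all_of` is a stand-in written for this example so that it compiles without LLVM; it is not the LLVM implementation, and `allPositive` is a hypothetical caller.

```cpp
#include <algorithm>
#include <cassert>
#include <iterator>
#include <vector>

namespace sketch {
// Stand-in for a range-based algorithm wrapper (assumption: it simply
// forwards to the iterator-based standard algorithm).
template <typename Range, typename Pred>
bool all_of(Range &&R, Pred P) {
  return std::all_of(std::begin(R), std::end(R), P);
}
} // namespace sketch

// Hypothetical caller: the container is passed directly, without spelling
// out begin() and end().
bool allPositive(const std::vector<int> &Values) {
  return sketch::all_of(Values, [](int V) { return V > 0; });
}

void demoAllOf() {
  assert(allPositive({1, 2, 3}));
  assert(!allPositive({1, -2, 3}));
}
```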

Clang is implemented on top of LLVM and introduces its own set of classes that you

will interact with while writing your check. When a check issues diagnostics and

fix-its, these are associated with locations in the source code. Source code locations,

source files, ranges of source locations and the `SourceManager

<https://clang.llvm.org/doxygen/classclang_1_1SourceManager.html>`_ class provide

the mechanisms for describing such locations. These and

other topics are described in the `"Clang" CFE Internals Manual

<https://clang.llvm.org/docs/InternalsManual.html>`_. Whereas the doxygen generated

documentation serves as a reference to the internals of Clang, the internals manual serves

as a guide for other developers. Topics in that manual of interest to a check developer

are:

- `The Clang "Basic" Library

<https://clang.llvm.org/docs/InternalsManual.html#the-clang-basic-library>`_ for

information about diagnostics, fix-it hints and source locations.

- `The Lexer and Preprocessor Library

<https://clang.llvm.org/docs/InternalsManual.html#the-lexer-and-preprocessor-library>`_

for information about tokens, lexing (transforming characters into tokens) and the

preprocessor.

- `The AST Library

<https://clang.llvm.org/docs/InternalsManual.html#the-ast-library>`_

for information about how C++ source statements are represented as an abstract syntax

tree (AST).

Most checks will interact with C++ source code via the AST. Some checks will interact

with the preprocessor. The input source file is lexed and preprocessed and then parsed

into the AST. Once the AST is fully constructed, the check is run by applying the check's

registered AST matchers against the AST and invoking the check with the set of matched

nodes from the AST. Monitoring the actions of the preprocessor is detached from the

AST construction, but a check can collect information during preprocessing for later

use by the check when AST nodes are matched.

Every syntactic (and sometimes semantic) element of the C++ source code is represented by

different classes in the AST. You select the portions of the AST you're interested in

by composing AST matcher functions. You will want to study carefully the `AST Matcher

Reference <https://clang.llvm.org/docs/LibASTMatchersReference.html>`_ to understand

the relationship between the different matcher functions.
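
The idea of composing matcher functions can be sketched with a toy model. Everything below (`ToyFunction`, `ToyMatcher` and the `toy*` functions) is invented for illustration and is not the Clang AST or the AST matcher API. It only shows how small predicates combine into a more specific one.

```cpp
#include <cassert>
#include <functional>
#include <string>
#include <utility>

// Toy model only: NOT a Clang AST node.
struct ToyFunction {
  std::string Name;
  unsigned NumParams;
  bool IsDefinition;
};

// A toy matcher is a predicate over a node.
using ToyMatcher = std::function<bool(const ToyFunction &)>;

// Narrowing matcher: matches functions with the given name.
ToyMatcher toyHasName(std::string Name) {
  return [Name = std::move(Name)](const ToyFunction &F) {
    return F.Name == Name;
  };
}

// Narrowing matcher: matches functions that have a body.
ToyMatcher toyIsDefinition() {
  return [](const ToyFunction &F) { return F.IsDefinition; };
}

// Combinator: matches only if both inner matchers match.
ToyMatcher toyAllOf(ToyMatcher A, ToyMatcher B) {
  return [A = std::move(A), B = std::move(B)](const ToyFunction &F) {
    return A(F) && B(F);
  };
}

void demoToyMatchers() {
  const ToyMatcher FooDefinition =
      toyAllOf(toyHasName("foo"), toyIsDefinition());
  assert(FooDefinition({"foo", 1, true}));
  assert(!FooDefinition({"foo", 1, false}));
  assert(!FooDefinition({"bar", 1, true}));
}
```

The real matchers follow the same compositional shape, but they operate on actual AST classes and their names and arguments should be taken from the AST Matcher Reference rather than from this sketch.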

Using the Transformer library

^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

The Transformer library allows you to write a check that transforms source code by

expressing the transformation as a ``RewriteRule``. The Transformer library provides

functions for composing edits to source code to create rewrite rules. Unless you need

to perform low-level source location manipulation, you may want to consider writing your

check with the Transformer library. The `Clang Transformer Tutorial

<https://clang.llvm.org/docs/ClangTransformerTutorial.html>`_ describes the Transformer

library in detail.

To use the Transformer library, make the following changes to the code generated by

the ``add_new_check.py`` script:

- Include ``../utils/TransformerClangTidyCheck.h`` instead of ``../ClangTidyCheck.h``

- Change the base class of your check from ``ClangTidyCheck`` to ``TransformerClangTidyCheck``

- Delete the override of the ``registerMatchers`` and ``check`` methods in your check class.

- Write a function that creates the ``RewriteRule`` for your check.

- Call the function in your check's constructor to pass the rewrite rule to

``TransformerClangTidyCheck``'s constructor.

Developing your check incrementally

^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

The best way to develop your check is to start with the simple test cases and increase

complexity incrementally. The test file created by the ``add_new_check.py`` script is

a starting point for your test cases. A rough outline of the process looks like this:

- Write a test case for your check.

- Prototype matchers on the test file using :program:`clang-query`.

- Capture the working matchers in the ``registerMatchers`` method.

- Issue the necessary diagnostics and fix-its in the ``check`` method.

- Add the necessary ``CHECK-MESSAGES`` and ``CHECK-FIXES`` annotations to your

test case to validate the diagnostics and fix-its.

- Build the target ``check-clang-tools`` to confirm the test passes.

- Repeat the process until all aspects of your check are covered by tests.
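As a rough illustration of the test-case steps in this outline, a lit test file might look like the sketch below. The check name ``misc-hypothetical-check``, its diagnostic text, and its fix-it are all invented for illustration; only the ``%check_clang_tidy`` RUN line and the ``CHECK-MESSAGES``/``CHECK-FIXES`` annotation forms follow the existing test conventions. A small stand-in container type is declared so the file does not depend on system headers.

```cpp
// RUN: %check_clang_tidy %s misc-hypothetical-check %t
//
// Assumed behaviour of the hypothetical check: it flags 'size() == 0' on a
// container and offers a fix-it that rewrites the comparison to 'empty()'.

// Minimal stand-in for a container so the test needs no system headers.
struct Container {
  unsigned size() const { return Count; }
  bool empty() const { return Count == 0; }
  unsigned Count = 0;
};

// Simplest "happy path" case: one diagnostic and one fix-it are expected.
bool isEmptyFlagged(const Container &C) {
  return C.size() == 0;
  // CHECK-MESSAGES: :[[@LINE-1]]:10: warning: use 'empty()' instead of comparing 'size()' to 0 [misc-hypothetical-check]
  // CHECK-FIXES: return C.empty();
}

// Negative case: 'empty()' is already used, so no diagnostic is expected.
bool isEmptyClean(const Container &C) {
  return C.empty();
}
```

Starting with one positive and one negative case like this keeps the first iteration small; further cases (macros, templates, headers) can be added once these pass.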

The quickest way to prototype your matcher is to use :program:`clang-query` to
interactively build up your matcher. For complicated matchers, build up a matching
expression incrementally and use :program:`clang-query`'s ``let`` command to save named
matching expressions to simplify your matcher. Just like breaking up a huge function
into smaller chunks with intention-revealing names can help you understand a complex
algorithm, breaking up a matcher into smaller matchers with intention-revealing names
can help you understand a complicated matcher. Once you have a working matcher, the
C++ API will be virtually identical to your interactively constructed matcher. You can
use local variables to preserve your intention-revealing names that you applied to
nested matchers.

LegalizeAdulthood: chunks plural, "an intention revealing name" singular mismatch

Eugene.Zelenko: API.

aaron.ballman:

- into smaller chunks with intention revealing names can help you understand a complex
- algorithm, breaking up a matcher into smaller matchers with intention revealing names
+ into smaller chunks with intention-revealing names can help you understand a complex
+ algorithm, breaking up a matcher into smaller matchers with intention-revealing names

aaron.ballman:

- use local variables to preserve your intention revealing names that you applied to
+ use local variables to preserve your intention-revealing names that you applied to

Creating private matchers

^^^^^^^^^^^^^^^^^^^^^^^^^

Sometimes you want to match a specific aspect of the AST that isn't provided by the

existing AST matchers. You can create your own private matcher using the same

infrastructure as the public matchers. A private matcher can simplify the processing

in your ``check`` method by eliminating complex hand-crafted AST traversal of the

matched nodes. Using the private matcher allows you to select the desired portions

of the AST directly in the matcher and refer to it by a bound name in the ``check``

method.

Unit testing helper code

^^^^^^^^^^^^^^^^^^^^^^^^

Private custom matchers are a good example of auxiliary support code for your check
that can be tested with a unit test. It will be easier to test your matchers or
other support classes by writing a unit test than by writing a ``FileCheck`` integration
test. The ``ASTMatchersTests`` target contains unit tests for the public AST matcher
classes and is a good source of testing idioms for matchers.

aaron.ballman: Do we have any private matchers that are unit tested? My understanding was that the public matchers were all unit tested, but that the matchers which are local to a check are not exposed via a header file, and so they're not really unit testable.

LegalizeAdulthood: I just did this for my switch/case/label update to simplify boolean expressions. You do have to expose them via a header file, which isn't a big deal.

aaron.ballman: > You do have to expose them via a header file, which isn't a big deal.

Then they become part of the public interface for the check and anything which includes the header file has to do heavy template instantiation work that contributes more symbols to the built library. In general, we don't expect private matchers to be unit tested -- we expect the tests for the check to exercise the matcher appropriately.

LegalizeAdulthood: Look at my review to see how I handled it; the matchers are in a separate header file, in my case `SimplifyBooleanExprMatchers.h`, and aren't exposed to consumers of the check. The matchers header is only included by the implementation and the matcher tests.

aaron.ballman: Thanks for pointing out how you're doing it in one of your checks -- I still don't think we should document that we expect people to unit test private matchers unless clang-tidy reviewers are on board with the idea in general. IMO, that's putting a burden on check authors and all the reviewers to do something that's never been suggested before (let alone documented as a best practice) -- that's worthy of open discussion instead of adding it to a patch intended to document current practices.

ymandel: Agreed on this point -- that we should push this off to discussion on a separate patch. I'm definitely fine with pointing out that unit testing is possible, since that may not be obvious. But, I'd have to be convinced that we want to insist on it.

LegalizeAdulthood: If I change the wording from "is best tested with a unit test" to "can be tested with a unit test", would that alleviate the concern? I want to encourage appropriate testing, and unit testing complex helper code is better than integration testing helper code.

I find it easier to have confidence in private matchers if they are unit tested and I've recently had a few situations where I had to write relatively complex helper functions to analyze raw text that I felt would have been better tested with a unit test than an integration test.

aaron.ballman: > If I change the wording from "is best tested with a unit test" to "can be tested with a unit test", would that alleviate the concern?

It largely addresses mine -- saying it can be done is great, saying it should be done is what gave me pause.

ymandel: +1
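As a small illustration of exercising helper code directly, the sketch below tests a text-analysis helper with plain assertions. The helper ``isUpperSnakeCase`` is hypothetical and not taken from any existing check, and ``assert`` stands in for a test framework only to keep the example self-contained; in-tree unit tests are written with GoogleTest.

```cpp
#include <cassert>
#include <cctype>
#include <string>

// Hypothetical helper of the kind a check might use to analyze raw text:
// returns true if Name is non-empty, does not start with a digit, and
// consists only of uppercase letters, digits and underscores.
bool isUpperSnakeCase(const std::string &Name) {
  if (Name.empty() || std::isdigit(static_cast<unsigned char>(Name.front())))
    return false;
  for (char C : Name) {
    const auto U = static_cast<unsigned char>(C);
    if (!(std::isupper(U) || std::isdigit(U) || C == '_'))
      return false;
  }
  return true;
}

// Each edge case is checked directly, without running clang-tidy on a file.
void testIsUpperSnakeCase() {
  assert(isUpperSnakeCase("MAX_SIZE"));
  assert(isUpperSnakeCase("_X1"));
  assert(!isUpperSnakeCase(""));
  assert(!isUpperSnakeCase("1ST"));
  assert(!isUpperSnakeCase("maxSize"));
}
```

Covering the empty string and the leading-digit case through a ``FileCheck`` test would require constructing source files that happen to reach those paths, which is the extra effort a direct test avoids.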

Making your check robust

^^^^^^^^^^^^^^^^^^^^^^^^

Once you've covered your check with the basic "happy path" scenarios, you'll want to
torture your check with as many edge cases as you can cover in order to ensure your
check is robust. Running your check on a large code base, such as Clang/LLVM, is a
good way to catch things you forgot to account for in your matchers. However, the
LLVM code base may be insufficient for testing purposes as it was developed against a
particular set of coding styles and quality measures. The larger the corpus of code
the check is tested against, the higher confidence the community will have in the
check's efficacy and false positive rate.

aaron.ballman:

- torture your check with some crazy C++ in order to ensure your check is robust. Running
+ torture your check with as many edge cases as you can cover in order to ensure your check is robust. Running

Reworded to avoid a loaded term and not make it sound C++ specific; needs re-flowing to 80-col limits

aaron.ballman:

- forgot to account for in your matchers. However, the LLVM code base is "reasonable" and
- doesn't contain weird template or macro oriented code.
+ forgot to account for in your matchers. However, the LLVM code base may be insufficient for testing purposes as it was developed against a particular set of coding styles and quality measures. The larger the corpus of code the check is tested against, the higher confidence the community will have in the check's efficacy and false positive rate.

Some suggestions to ensure your check is robust:

- Create header files that contain code matched by your check.

- Validate that fix-its are properly applied to test header files with
  :program:`clang-tidy`. You will need to perform this test manually until
  automated support for checking messages and fix-its is added to the
  ``check_clang_tidy.py`` script.

aaron.ballman: Another one that catches people out is testing on Windows vs non-Windows targets (as targets which are compatible with MSVC default to a different template instantiation behavior).

One key thing that's not discussed explicitly is false positive rates. Clang-tidy can have significantly higher false positive rates than diagnostics in Clang or the static analyzer, but we still care about not being overly chatty. But "overly chatty" depends on the check -- if the check is for a coding standard and the coding standard says "no calls to 'foo()'", then it's not chatty to diagnose every call to "foo()". But if the check is not following a coding standard, but is instead trying to help someone modernize their code, find bugs, or make it more readable (as examples), then perhaps diagnosing every call to "foo()" will be too chatty. So we ask people to test their code against real world code bases to try to gauge what the false positive and true positive rates are for the check, just to make sure they seem reasonable.

Another thing we may want to mention is that checks can be configured. This helps not only with exposing different functionality for the check, but also can help to control perceived false positives for some use cases.

- Define macros that contain code matched by your check.

Eugene.Zelenko: Link to Release Notes?

- Define template classes that contain code matched by your check.

- Define template specializations that contain code matched by your check.

- Test your check under both Windows and Linux environments.

- Watch out for high false positive rates. Ideally, a check would have no false
  positives, but given that matching against an AST is not control- or data
  flow-sensitive, a number of false positives are expected. The higher the false
  positive rate, the less likely the check will be adopted in practice.
  Mechanisms should be put in place to help the user manage false positives.

ymandel: Is it worth giving a rule-of-thumb? Personally I'd go with < 10%, all things being equal. A check for a serious bug may reasonably have a higher false positive rate, and trivial checks might not justify *any* false positives. But, if neither of these apply, I'd recommend 10% as the default.

LegalizeAdulthood: I'm OK with rule-of-thumb 10% advice.

aaron.ballman: FWIW, I think 10% is pretty arbitrary and I'd rather not see us try to nail it down to a concrete number. In practical terms, it really depends on the check.

Also, clang-tidy is where we put things with a "high" false positive rate already, so this statement has implications on what an acceptable false positive rate is for Clang or the static analyzer.

How about something along these lines:

- Watch out for high false positive rates. Ideally, a check would have no false positives, but given that matching against an AST is not control- or data flow-sensitive, a number of false positives are expected. The higher the false positive rate, the less likely the check will be adopted in practice. Mechanisms should be put in place to help the user manage false positives.
- There are two primary mechanisms for managing false positives: supporting a code pattern which allows the programmer to silence the diagnostic in an ad hoc manner and check configuration options to control the behavior of the check.
- Consider supporting a code pattern to allow the programmer to silence the diagnostic whenever such a code pattern can clearly express the programmer's intent. For example, allowing an explicit cast to `void` to silence an unused variable diagnostic.
- Consider adding check configuration options to allow the user to opt into more aggressive checking behavior without burdening users for the common high-confidence cases.

ymandel: Strongly agree. 10% has served us well in practice for the threshold at which we disable/fix checks, but it's definitely arbitrary. I much prefer your suggested approach.

LegalizeAdulthood: Yeah, I wasn't a fan of a magic number style piece of advice. I like the reworded suggestion better.

- There are two primary mechanisms for managing false positives: supporting a

code pattern which allows the programmer to silence the diagnostic in an ad

hoc manner and check configuration options to control the behavior of the check.

- Consider supporting a code pattern to allow the programmer to silence the

diagnostic whenever such a code pattern can clearly express the programmer's

intent. For example, allowing an explicit cast to ``void`` to silence an

unused variable diagnostic.

- Consider adding check configuration options to allow the user to opt into

more aggressive checking behavior without burdening users for the common

high-confidence cases.
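The explicit cast to ``void`` mentioned in the list is a widely used idiom; the sketch below shows what such a silencing pattern looks like from the user's side. The function names are invented for illustration, and whether a given check honors the pattern is up to that check's implementation.

```cpp
// Hypothetical function whose result a caller may deliberately not use.
int computeChecksum(int Value) { return Value * 31 + 7; }

int process(int Input) {
  // A check that reports unused variables would be expected to diagnose
  // this declaration if Unused were never referenced afterwards.
  int Unused = computeChecksum(Input);

  // The explicit cast to void states the programmer's intent, so a check
  // that supports this pattern can stay silent here.
  (void)Unused;

  return Input + 1;
}
```

A pattern like this works well as a silencing mechanism because it is local to the flagged code and unlikely to be written by accident.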

Documenting your check

^^^^^^^^^^^^^^^^^^^^^^

The ``add_new_check.py`` script creates entries in the

`release notes <https://clang.llvm.org/extra/ReleaseNotes.html>`_, the list of

checks and a new file for the check documentation itself. It is recommended that you

have a concise summation of what your check does in a single sentence that is repeated

in the release notes, as the first sentence in the doxygen comments in the header file

for your check class and as the first sentence of the check documentation. Avoid the

phrase "this check" in your check summation and check documentation.

If your check relates to a published coding guideline (C++ Core Guidelines, MISRA, etc.)

or style guide, provide links to the relevant guideline or style guide sections in your

check documentation.

Provide enough examples of the diagnostics and fix-its provided by the check so that a

user can easily understand what will happen to their code when the check is run.

If there are exceptions or limitations to your check, document them thoroughly. This

will help users understand the scope of the diagnostics and fix-its provided by the check.

Building the target ``docs-clang-tools-html`` will run the Sphinx documentation generator

and create documentation HTML files in the ``tools/clang/tools/extra/docs/html`` directory in

your build tree. Make sure that your check is correctly shown in the release notes and the

list of checks. Make sure that the formatting and structure of your check's documentation

looks correct.

Registering your Check
----------------------

(The ``add_new_check.py`` script takes care of registering the check in an existing
module. If you want to create a new module or know the details, read on.)

The check should be registered in the corresponding module with a distinct name:

.. code-block:: c++

  class MyModule : public ClangTidyModule {
   public:

[80 lines not shown]

.. code-block:: console

  $ clang-tidy -config="{CheckOptions: [{key: a, value: b}, {key: x, value: y}]}" ...

Testing Checks
--------------

To run tests for :program:`clang-tidy`, build the ``check-clang-tools`` target.
For instance, if you configured your CMake build with the ninja project generator,
use the command:

.. code-block:: console

  $ ninja check-clang-tools

:program:`clang-tidy` checks can be tested using either unit tests or
`lit`_ tests. Unit tests may be more convenient to test complex replacements
with strict checks. `Lit`_ tests allow using partial text matching and regular

[61 lines not shown]

.. code-block:: c++

  ...
  // CHECK-MESSAGES-USING-A: :[[@LINE-8]]:10: warning: using decl 'A' {{.*}}
  // CHECK-MESSAGES-USING-B: :[[@LINE-7]]:10: warning: using decl 'B' {{.*}}
  // CHECK-MESSAGES: :[[@LINE-6]]:10: warning: using decl 'C' {{.*}}
  // CHECK-FIXES-USING-A-NOT: using a::A;$
  // CHECK-FIXES-USING-B-NOT: using a::B;$
  // CHECK-FIXES-NOT: using a::C;$

There are many dark corners in the C++ language, and it may be difficult to make
your check work perfectly in all cases, especially if it issues fix-it hints. The
most frequent pitfalls are macros and templates:

1. code written in a macro body/template definition may have a different meaning
   depending on the macro expansion/template instantiation;
2. multiple macro expansions/template instantiations may result in the same code
   being inspected by the check multiple times (possibly, with different
   meanings, see 1), and the same warning (or a slightly different one) may be
   issued by the check multiple times; :program:`clang-tidy` will deduplicate
   _identical_ warnings, but if the warnings are slightly different, all of them
   will be shown to the user (and used for applying fixes, if any);
3. making replacements to a macro body/template definition may be fine for some
   macro expansions/template instantiations, but easily break some other
   expansions/instantiations.
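To make the first pitfall concrete, the sketch below shows one macro body and one template definition whose meaning depends on how they are expanded or instantiated. All names are invented for illustration.

```cpp
#include <string>

// The macro body '(a) + (b)' is integer addition in one expansion and
// string concatenation in another.
#define ADD(a, b) ((a) + (b))

int addInts() { return ADD(1, 2); }
std::string addStrings() { return ADD(std::string("ab"), std::string("cd")); }

// The expression 'Value / 2' is integer division for T = int and
// floating-point division for T = double.
template <typename T> T half(T Value) { return Value / 2; }

int halfOfSeven() { return half(7); }        // instantiates half<int>
double halfOfSevenF() { return half(7.0); }  // instantiates half<double>
```

A check that inspects or rewrites the macro body or the template definition sees the same source text under both meanings, which is why a fix-it that is correct for one expansion or instantiation may be wrong for another.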

If you need multiple files to exercise all the aspects of your check, it is

recommended you place them in a subdirectory named for the check under ``Inputs``.

This keeps the test directory from getting cluttered.

.. _lit: https://llvm.org/docs/CommandGuide/lit.html
.. _FileCheck: https://llvm.org/docs/CommandGuide/FileCheck.html
.. _test/clang-tidy/google-readability-casting.cpp: https://reviews.llvm.org/diffusion/L/browse/clang-tools-extra/trunk/test/clang-tidy/google-readability-casting.cpp

Running clang-tidy on LLVM
--------------------------

[92 lines not shown]


Revision Contents

Diff 403677

clang-tools-extra/docs/clang-tidy/Contributing.rst
