This is an archive of the discontinued LLVM Phabricator instance.

Differential D54141

[clang-tidy] add deduplication support for run-clang-tidy.py
AbandonedPublic

Authored by JonasToth on Nov 6 2018, 2:19 AM.


Details

Reviewers

alexfh
aaron.ballman
hokein
sammccall
serge-sans-paille
lebedev.ri

Summary

run-clang-tidy.py is the parallel executor for clang-tidy. Due to the
common header-inclusion problem in C++/C, diagnostics that are usually emitted
in class declarations are emitted every time their corresponding header is
included.

This results in a *VERY* high amount of spam and renders the output basically
useless for bigger projects.
With this patch run-clang-tidy.py gets another option that enables
deduplication of all emitted diagnostics (off by default). This is achieved by parsing the
diagnostic output from each clang-tidy invocation, identifying warnings and
errors, and collecting lines until the next occurrence of an error or warning. The collected
diagnostic is hashed and stored in a set. Every new diagnostic will only be
emitted if its hash is not in the set already.
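
The parse, hash and filter steps described above can be sketched roughly as follows. This is a minimal illustration and not the code from the patch: the regular expression, the function names (`split_diagnostics`, `emit_unique`) and the sample diagnostics are all assumptions made for the example.

```python
import hashlib
import re

# Assumption: a diagnostic starts at a line shaped like
# "path:line:col: warning: ..." or "path:line:col: error: ...".
DIAG_START = re.compile(r"^.+:\d+:\d+: (warning|error): ")


def split_diagnostics(output):
    """Group output lines into blocks that each start at a warning/error line."""
    blocks = []
    for line in output.splitlines():
        if DIAG_START.match(line):
            blocks.append([line])      # a new diagnostic begins here
        elif blocks:
            blocks[-1].append(line)    # notes and code snippets stay attached
    return ["\n".join(block) for block in blocks]


def emit_unique(output, seen):
    """Return only the diagnostics whose SHA256 digest is not yet in `seen`."""
    unique = []
    for block in split_diagnostics(output):
        digest = hashlib.sha256(block.encode("utf-8")).hexdigest()
        if digest not in seen:
            seen.add(digest)
            unique.append(block)
    return unique


# Made-up output of two clang-tidy invocations that include the same header.
HEADER_DIAG = (
    "common.h:3:8: warning: hypothetical message [hypothetical-check]\n"
    "  void f();\n"
    "       ^"
)
SOURCE_DIAG = "a.cpp:7:8: warning: hypothetical message [hypothetical-check]"

seen = set()
first_run = emit_unique(HEADER_DIAG + "\n" + SOURCE_DIAG, seen)
second_run = emit_unique(HEADER_DIAG, seen)
print(len(first_run), len(second_run))
```

In this sketch the second invocation contributes nothing new, because its only diagnostic comes from the shared header and its digest is already in the set.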

Numbers to show the issue

I am currently creating a buildbot for running clang-tidy over real world projects. Some experience comes from there, I reproduced one specific case for this test. It is not made up and not even the worst I could see.

Running clang-tidy's misc module over llvm/lib:

/fast_data2/llvm/tools/clang/tools/extra/clang-tidy/tool/run-clang-tidy.py \
    -checks=-*,misc-* \
    -header-filter=".*" \
    -clang-tidy-binary /fast_data2/llvm/build_clang_fast/bin/clang-tidy \
    -fix \
    lib/ \
    2>/fast_data2/rct_dedup_lib.err.misc \
    1>/fast_data2/rct_dedup_lib.out.misc

produces over 300MB of diagnostic output. The run-clang-tidy.py script consumes up to 0.8%*32GB of RAM on my machine.

373K Nov  5 22:48 rct_lib.err.misc
306M Nov  5 22:48 rct_lib.out.misc

Doing the same analysis but with -deduplication enabled results in 5.4MB of diagnostic output (two orders of magnitude less!) and run-clang-tidy.py only consumes up to 0.5%*32GB of RAM.

373K Nov  5 23:13 rct_dedup_lib.err.misc
5,4M Nov  5 23:13 rct_dedup_lib.out.misc

Notes

The difference in RAM usage for the run-clang-tidy.py script seems suspicious, as one would expect deduplication to need more RAM than only printing the stuff out.
It might be a memory leak in the script or some other effect. To my surprise we are better off deduplicating. I did not measure run-time differences, but I suspect they decrease as well, as piping hundreds of MB through stdout in Python is probably slower.

I found multiple checks that are specifically prone to producing *A LOT* of spam, e.g. bugprone-macro-parentheses. I collected statistics in my buildbot where the spammy checks easily had 100 times the output they needed to have (consistent with the finding in the llvm/lib example).
Running modules with spam-prone checks over the whole of LLVM resulted in ~GB of log output. I could not measure more because my buildbot just refused to give me the full log files.

Correctness

I checked against the output of grep "warning: " | sort | uniq -c | sort -n -r for the log files. It showed that every diagnostic in the deduplicated output occurred exactly once.
The hashing is done with SHA256, which is considered to be secure, so no collisions are expected. For this use case MD5 might even be viable, but judging by the htop output
the 16 cores of my machine were all fully loaded, so there doesn't seem to be a performance issue from too slow hashing or similar (the parsing is done within the lock, so no parallelization there!).

Diff Detail

Repository

rCTE Clang Tools Extra

Build Status

Buildable 24595
Build 24594: arc lint + arc unit

Event Timeline

JonasToth created this revision. Nov 6 2018, 2:19 AM

Herald added subscribers: cfe-commits, xazax.hun, mgorny. Nov 6 2018, 2:19 AM

Harbormaster completed remote builds in B24593: Diff 172724. Nov 6 2018, 2:19 AM

spurious change in my git?

Harbormaster completed remote builds in B24595: Diff 172726. Nov 6 2018, 2:21 AM

JonasToth edited the summary of this revision. Nov 6 2018, 2:41 AM

JonasToth added a project: Restricted Project.

JonasToth added inline comments.

clang-tidy/tool/run_clang_tidy.py
Line 1: This symlink is required for my unit tests. I don't know how to add the new tests to the `lit` test suite, so there is no change there yet. A bit of guidance there would be nice :)

Thanks for the patch and nice improvements.

Some initial thoughts:

The diagnostic output of clang-tidy is YAML, and YAML is not a space-efficient format (it is meant for human readability). If you want to save more space, we might consider using a compressed format, e.g. llvm::bitcode. Given that the reduced YAML result (5.4MB) is promising, this might not matter.
clang-tidy itself doesn't do deduplication, and run-clang-tidy.py seems to be an old way of running clang-tidy in parallel. The Python script seems to become more complicated now. We have AllTUsToolExecutor right now, which supports running clang tools on a compilation database in parallel, so another option would be to use AllTUsToolExecutor in clang-tidy, and we can do deduplication inside the clang-tidy binary (in the reduce phase), which should be faster than the Python script (spawning new clang-tidy processes and round-tripping all the data through YAML-on-disk).

This feature seems like a good idea. I started writing it too some months ago, but then I changed tactic and worked on distributing the refactor over the network instead. As far as I know, your deduplication would not work with a distributed environment.

However, it seems that both features can exist.

You use a regex to parse the clang output. Why not use the already-machine-readable yaml output and de-duplicate based on that? I think the design would be something like:

Run clang-tidy in a quiet mode which only exports yaml and does not issue diagnostics
Read the yaml in your python script
Add the entries to your already-seen cache
For any entry which was not already there
- Write the entries to a new yaml file
- Use clang-apply-replacements --issue-diags the_new_file.yaml to actually cause the new diagnostics to be issued (they were omitted from the clang-tidy run).

This avoids fragile parsing of the output from clang, instead relying on the machine-readable format.

I think clang-apply-replacements already does de-duplication, so it's possible that could take more responsibility.

Also, I think your test content is too big. I suggest trying to write more contained tests for this.

The diagnostic output of clang-tidy is YAML, and YAML is not a space-efficient format (it is meant for human readability). If you want to save more space, we might consider using a compressed format, e.g. llvm::bitcode. Given that the reduced YAML result (5.4MB) is promising, this might not matter.

The output was normal diagnostics written to stdout; deduplication happens from there (see the test cases). The files I created were just made through piping to filter some of the noise.
Without de-duplication it's very hard to get something useful out of a run with many checks activated for bigger projects (e.g. Blender and OpenCV are useless to try, because they have some commonly used macros with a check violation. The buildbot filled 30GB of RAM before it crashed and couldn't even finish the analysis of the project. Similar for LLVM).

I would like to try the simple deduplication first and see if space is still an issue. After all, I just want to read the diagnostics and see what's happening instantly, and a more compressed format might not help there.

clang-tidy itself doesn't do deduplication, and run-clang-tidy.py seems to be an old way of running clang-tidy in parallel. The Python script seems to become more complicated now. We have AllTUsToolExecutor right now, which supports running clang tools on a compilation database in parallel, so another option would be to use AllTUsToolExecutor in clang-tidy, and we can do deduplication inside the clang-tidy binary (in the reduce phase), which should be faster than the Python script (spawning new clang-tidy processes and round-tripping all the data through YAML-on-disk).

I agree that AllTUsToolExecutor would be better than the Python script, but I think getting this done takes longer than just patching the script now. With the patch here (it is an off-by-default option as well) it is easier to test all pieces of clang-tidy. From there we can easily migrate to something better than `run-clang-tidy.py`.

The deduplication within clang-tidy would be the best option! But for full deduplication the parallelization must happen first.

The Python script seems to become more complicated now.

A bit, yes. The actual calling of clang-tidy and other parts are not touched. Just the parser adds additional complexity, which is covered in the unit tests. I don't think this solution will live forever, but it's fast and effective. It's optional and off by default.

Thank you for the comment!

In D54141#1288809, @steveire wrote:

This feature seems like a good idea. I started writing it too some months ago, but then I changed tactic and worked on distributing the refactor over the network instead. As far as I know, your deduplication would not work with a distributed environment.

I agree that it would probably not work. It might enable a two-stage deduplication, but I don't know if that would be viable.

However, it seems that both features can exist.

You use a regex to parse the clang output. Why not use the already-machine-readable yaml output and de-duplicate based on that? I think the design would be something like:

Run clang-tidy in a quiet mode which only exports yaml and does not issue diagnostics

Read the yaml in your python script

Add the entries to your already-seen cache

For any entry which was not already there

Write the entries to a new yaml file

Use clang-apply-replacements --issue-diags the_new_file.yaml to actually cause the new diagnostics to be issued (they were omitted from the clang-tidy run).

This avoids fragile parsing of the output from clang, instead relying on the machine-readable format.

In principle this approach seems more robust and I am not claiming my approach is robust at all :)
The point hokein raised should be considered first in my opinion. If clang-tidy itself is already parallel we should definitely deduplicate there. This is something I would put more
effort in. The proposed solution is more a hack to get my buildbot running and find transformation bugs and provide real-world data for checks we implement. :)

I think clang-apply-replacements already does de-duplication, so it's possible that could take more responsibility.

Yes, the emitted fixes are deduplicated, but I think we need something even if no fixes are involved.

Also, I think your test content is too big. I suggest trying to write more contained tests for this.

I wanted to have a mix of both real snippets and some unit tests on short examples. Do you think it's enough if I shorten the list of fields that the CSA output contains for the padding checker?

In D54141#1288818, @JonasToth wrote:

Thank you for the comment!

In D54141#1288809, @steveire wrote:

This feature seems like a good idea. I started writing it too some months ago, but then I changed tactic and worked on distributing the refactor over the network instead. As far as I know, your deduplication would not work with a distributed environment.

I agree that it would probably not work. It might enable a two-stage deduplication, but I don't know if that would be viable.

Yes, I think the distributed refactoring would benefit from the design I outlined - issuing diagnostics from the yaml files.

In principle this approach seems more robust and I am not claiming my approach is robust at all :)
The point hokein raised should be considered first in my opinion. If clang-tidy itself is already parallel we should definitely deduplicate there. This is something I would put more
effort in. The proposed solution is more a hack to get my buildbot running and find transformation bugs and provide real-world data for checks we implement. :)

Yes, I think it makes sense to do something more robust. I understand you're yak shaving here a bit while trying to reach a higher goal.

The AllTUsToolExecutor idea is worth exploring - it would mean we could remove the threading from run-clang-tidy.py.

I don't think we should get a self-confessed hack in just because it's already written.

However, AllTUsToolExecutor seems to not create output replacement files at all, which is not distributed-refactoring-friendly.

I think clang-apply-replacements already does de-duplication, so it's possible that could take more responsibility.

Yes, the emitted fixes are deduplicated, but I think we need something even if no fixes are involved.

Maybe my suggestion was not clear. The yaml file generated by clang-tidy contains not only replacements, but all diagnostics, even without a fixit.

So, running clang-apply-replacements --issue-diags the_new_file.yaml would issue the warnings/fixit hints by processing the yaml and issuing the diagnostics the way clang-tidy would have done (See in my proposed design that we silence clang-tidy).

Note also that the --issue-diags option does not yet exist. I'm proposing adding it.

Also, I think your test content is too big. I suggest trying to write more contained tests for this.

I wanted to have a mix of both real snippets and some unit tests on short examples. Do you think it's enough if I shorten the list of fields that the CSA output contains for the padding checker?

It seems that the bulk of the testing part of this commit is parsing a real-world log that you made. I guess if you remove the parsing (by taking a machine-readable approach) that bulk will disappear anyway.

Maybe my suggestion was not clear. The yaml file generated by clang-tidy contains not only replacements, but all diagnostics, even without a fixit.

So, running clang-apply-replacements --issue-diags the_new_file.yaml would issue the warnings/fixit hints by processing the yaml and issuing the diagnostics the way clang-tidy would have done (See in my proposed design that we silence clang-tidy).

Note also that the --issue-diags option does not yet exist. I'm proposing adding it.

At the moment clang-apply-replacements is called at the end of a clang-tidy run in run-clang-tidy.py. That means we produce ~GBs of YAML first, to then emit ~10MBs worth of it.
I think just relying on clang-apply-replacements is not ok.
If we do a hybrid, with on-the-fly deduplication within clang-tidy/run-clang-tidy.py and a potential final deduplication with clang-apply-replacements, we get the best of both worlds.
That fits the distributed use-case as well(? I don't use a distributed system for these things as my projects are too small), because the first stage is local, the second stage central after all local workers are done.

It seems that the bulk of the testing part of this commit is parsing a real-world log that you made. I guess if you remove the parsing (by taking a machine-readable approach) that bulk will disappear anyway.

That is true. The last bit you have to convince me of is on-the-fly output to see what's going on. I personally usually just take the raw textual representation and grep/scroll through it to see what's going on. There might be a bit of tension between large-scale and small/medium-scale applications.

In D54141#1288851, @JonasToth wrote:

So, running clang-apply-replacements --issue-diags the_new_file.yaml would issue the warnings/fixit hints by processing the yaml and issuing the diagnostics the way clang-tidy would have done (See in my proposed design that we silence clang-tidy).

Note also that the --issue-diags option does not yet exist. I'm proposing adding it.

At the moment clang-apply-replacements is called at the end of a clang-tidy run in run-clang-tidy.py. That means we produce ~GBs of YAML first, to then emit ~10MBs worth of it.
I think just relying on clang-apply-replacements is not ok.

I think my proposal is still unclear to you. Sorry about that.

I am proposing on-the-fly de-duplication, but without regex parsing of the diagnostic output of clang-tidy.

My proposal is still the same as I wrote before, but maybe what I write below will be more clear. Sorry if this seems condescendingly detailed. I don't know where the misunderstanding is coming from, so I err on the side of detail:

Imagine you have two cpp files which both include the same header. Imagine also that all 3 files are missing 1 override keyword and you run the modernize-use-override check on the two translation units using run-clang-tidy.py.

Here is what I propose happens:

First, assume that the two translation units are processed serially, just to simplify the process as described. You will see that parallelizing does not change anything.
clang-tidy gets run on file1.cpp in a way that it does not write diagnostics (and fixes) to stdout/stderr, but only generates a yaml file representing the diagnostics (warnings and fixes).
Two diagnostics are created - one for the missing override in file1.cpp and one for the missing override in shared_header.h
the on-the-fly deduplication cache is empty, so both diagnostics get added to the on-the-fly deduplication cache
Because both were added to the cache, both diagnostics get written to a temporary file foo.yaml
clang-apply-replacements --issue-diags foo.yaml is run (some other tool could be used for this step, but CAR seems right to me)
clang-apply-replacements --issue-diags foo.yaml causes the two diagnostics to be issued, exactly as they would have been issued by clang-tidy.
clang-apply-replacements --issue-diags foo.yaml DOES NOT actually change the source code. It only emits diagnostics
Next process file2.cpp
Processing file2.cpp results in two diagnostics - one for the missing override in file2.cpp and one for the missing override in shared_header.h
NOTE: The diagnostic for the missing override in shared_header.h is a duplicate
NOTE: The diagnostic for the missing override in file2.cpp is NOT a duplicate
Try to add both to the on-the-fly deduplication cache
Discover that the diagnostic for file2.cpp IS NOT a duplicate. Add it to a temporary bar.yaml (named not to conflict with any other temporary file! This is built into temporary file APIs).
Discover that the diagnostic for shared_header.h IS a duplicate. DO NOT write it to the temporary file
Run clang-apply-replacements --issue-diags bar.yaml
Running clang-apply-replacements --issue-diags bar.yaml causes ONLY the diagnostic for file2.cpp to be issued because that is all that is in the file.
At the end, you have a de-duplicated yaml structure in memory. Write it to the file provided to the --export-fixes parameter of run-clang-tidy.py.
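
The walk-through above can be simulated with a small cache. This is only a sketch of the proposed flow: the dictionaries stand in for entries that would be read from the YAML files, their field names and offsets are made up and are not clang-tidy's actual schema, and the proposed --issue-diags step (which does not exist yet) is left out.

```python
def new_diagnostics(tu_diagnostics, seen):
    """Return the diagnostics of one translation unit that were not seen before."""
    fresh = []
    for diag in tu_diagnostics:
        # Assumed identity of a diagnostic: all fields of the entry.
        key = tuple(sorted(diag.items()))
        if key not in seen:
            seen.add(key)
            fresh.append(diag)
    return fresh


def override_diag(path, offset):
    """Build a hypothetical entry for a missing `override` keyword."""
    return {"check": "modernize-use-override", "file": path, "offset": offset}


seen = set()          # the on-the-fly deduplication cache
deduplicated = []     # what would finally be written for --export-fixes

# file1.cpp: both diagnostics are new, so both would be written to foo.yaml.
foo = new_diagnostics(
    [override_diag("file1.cpp", 120), override_diag("shared_header.h", 40)], seen
)
deduplicated.extend(foo)

# file2.cpp: the header diagnostic is a duplicate, so only the file2.cpp
# entry would be written to bar.yaml.
bar = new_diagnostics(
    [override_diag("file2.cpp", 95), override_diag("shared_header.h", 40)], seen
)
deduplicated.extend(bar)

print([d["file"] for d in foo], [d["file"] for d in bar])
```

Processing the two translation units in parallel would only require guarding the cache with a lock; the result per diagnostic stays the same.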

Do you understand the proposal now?

This means that you get

A deduplicated fixes file
De-duplicated diagnostics issued - which means you can process them in your CI system etc.

That fits the distributed use-case as well?

If the distributed system processes more than one file at a time on the remote computer.

I'm not aware of any distributed systems that work that way. I think they all process single files at a time.

However, when the resulting yaml files are sent back to your computer, your computer can deduplicate the diagnostics issued (in my design).

Do you understand the proposal now?

Yes, better. I was under the impression that clang-apply-replacements is run at the end and the YAMLs are kept until then. Now it's clear.
I assume --issue-diags produces the same result as the normal diagnostic engine. That could work, yes.

clang-tidy does not have a quiet mode though. It has the -quiet option which just does not emit how many warnings were created and suppressed.
Do you have these things already in the pipeline?

Reducing the log file size is a good idea, but I think it would also be a good idea to count duplicates. This would make it possible to concentrate clean-up efforts on the places where most of the warnings originate.
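
Counting could be layered on top of deduplication by replacing the set with a counter. The following is only a sketch of that idea; the function name `count_diagnostics` and the sample blocks are made up for illustration.

```python
from collections import Counter


def count_diagnostics(blocks):
    """Count how often each distinct diagnostic block was produced."""
    return Counter(blocks)


# Made-up diagnostic blocks as they might arrive from several invocations.
blocks = [
    "common.h:3:8: warning: hypothetical message [hypothetical-check]",
    "a.cpp:7:8: warning: hypothetical message [hypothetical-check]",
    "common.h:3:8: warning: hypothetical message [hypothetical-check]",
    "common.h:3:8: warning: hypothetical message [hypothetical-check]",
]

counts = count_diagnostics(blocks)
# Each distinct diagnostic is reported once, most frequent first.
report = ["{}x {}".format(n, block) for block, n in counts.most_common()]
print("\n".join(report))
```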

In D54141#1288993, @Eugene.Zelenko wrote:

Reducing the log file size is a good idea, but I think it would also be a good idea to count duplicates. This would make it possible to concentrate clean-up efforts on the places where most of the warnings originate.

Places that emit a lot of diagnostics still do. I think the number of duplicated warnings does not indicate how urgent the unique warning is.

In D54141#1288930, @JonasToth wrote:

Do you understand the proposal now?

Yes, better. I was under the impression that clang-apply-replacements is run at the end and the YAMLs are kept until then. Now it's clear.
I assume --issue-diags produces the same result as the normal diagnostic engine. That could work, yes.

clang-tidy does not have a quiet mode though. It has the -quiet option which just does not emit how many warnings were created and suppressed.
Do you have these things already in the pipeline?

Please let me clarify a bit: -export-fixes _DOES_ emit all warnings to yaml, but clang-tidy still prints the diagnostics out, even in -quiet mode. So a useful deduplication would require changes to the clang-tidy itself if going the YAML route.

In D54141#1288930, @JonasToth wrote:

Do you understand the proposal now?

Yes, better. I was under the impression that clang-apply-replacements is run at the end and the YAMLs are kept until then. Now it's clear.
I assume --issue-diags produces the same result as the normal diagnostic engine. That could work, yes.

Great, glad we got that misunderstanding sorted out.

clang-tidy does not have a quiet mode though. It has the -quiet option which just does not emit how many warnings were created and suppressed.
Do you have these things already in the pipeline?

No, I have not started on these.

At least the clang-tidy quiet mode is trivial to implement. Maybe instead of --quiet we could have --stdout=<output_format> where output_format can be one of none, diag, yaml and in the future possibly json (requested here: http://lists.llvm.org/pipermail/cfe-dev/2018-October/059944.html) or cbor, to address the binary output suggestion from @hokein.

At least the clang-tidy quiet mode is trivial to implement. Maybe instead of --quiet we could have --stdout=<output_format> where output_format can be one of none, diag, yaml and in the future possibly json (requested here: http://lists.llvm.org/pipermail/cfe-dev/2018-October/059944.html) or cbor, to address the binary output suggestion from @hokein.

Yes, it would slightly duplicate export-fixes, but I think that is not a big issue.
I think the first step would be using parallel execution within clang-tidy itself.
After that we can extract the deduplication logic from apply-replacements (if it is actually suitable!) or deduplicate in the diag() emitting phase. One thing we have to keep in mind is that deduplication must happen on the whole "warning: blaaa\n note: blaa \n note: 'aslkdjad' here comes the only change between two different warnings" construct.
If we don't group the diagnostics properly we have a bad time.
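
The grouping concern can be illustrated with two made-up diagnostics that share their warning line and differ only in a note. The check name, messages and the helper `group_diagnostics` are hypothetical; the sketch only shows why the key has to cover the whole group.

```python
def group_diagnostics(lines):
    """Attach every non-warning/error line to the diagnostic that precedes it."""
    groups = []
    for line in lines:
        if ": warning: " in line or ": error: " in line:
            groups.append([line])
        elif groups:
            groups[-1].append(line)
    return groups


# Two template instantiations: same warning line, different note.
lines = [
    "t.h:5:3: warning: type mismatch [hypothetical-check]",
    "t.h:5:3: note: type 'int' does not match",
    "t.h:5:3: warning: type mismatch [hypothetical-check]",
    "t.h:5:3: note: type 'long' does not match",
]

groups = group_diagnostics(lines)
by_first_line = {group[0] for group in groups}        # key: warning line only
by_whole_group = {tuple(group) for group in groups}   # key: warning plus notes
print(len(by_first_line), len(by_whole_group))
```

Keying on the warning line alone collapses both diagnostics into one, while keying on the whole group keeps them apart.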

That said, would you agree to have the parser-based deduplication as a developer-only opt-in solution for now? :)

In D54141#1289326, @JonasToth wrote:

That said, would you agree to have the parser-based deduplication as a developer-only opt-in solution for now? :)

If you're suggesting proceeding with this regex based solution, I don't think that's a good idea. Why commit a hack which people will object to ever removing? Just see if we can do the right thing instead.

If you're suggesting proceeding with this regex based solution, I don't think that's a good idea. Why commit a hack which people will object to ever removing? Just see if we can do the right thing instead.

+1, my main concern is the complexity of the patch and maintenance burden of the python script.

In D54141#1288811, @JonasToth wrote:

The diagnostic output of clang-tidy is YAML, and YAML is not a space-efficient format (it is meant for human readability). If you want to save more space, we might consider using a compressed format, e.g. llvm::bitcode. Given that the reduced YAML result (5.4MB) is promising, this might not matter.

The output was normal diagnostics written to stdout; deduplication happens from there (see the test cases). The files I created were just made through piping to filter some of the noise.
Without de-duplication it's very hard to get something useful out of a run with many checks activated for bigger projects (e.g. Blender and OpenCV are useless to try, because they have some commonly used macros with a check violation. The buildbot filled 30GB of RAM before it crashed and couldn't even finish the analysis of the project. Similar for LLVM).

I would like to try the simple deduplication first and see if space is still an issue. After all, I just want to read the diagnostics and see what's happening instantly, and a more compressed format might not help there.

I mistakenly thought that the output was the -export-fixes output, but what you mean is the stdout of clang-tidy.

Could you please explain your motivation for capturing clang-tidy's stdout? --export-fixes emits every diagnostic to YAML, even if the diagnostic doesn't have fixes. I guess the reason is that you want code snippets that you could show to users? If so, I think this is a separate UX problem, since we have everything in the emitted YAML, and we could construct whatever messages we want from it. A simpler approach may be:

1) run clang-tidy in parallel on the whole project and emit a deduplicated result (fixes.yaml).
2) run a postprocessing step in your buildbot that constructs diagnostic messages from fixes.yaml and stores them somewhere.
3) do whatever you want with the output from 1) and 2).

Step 1 could be done upstream, probably via AllTUsExecutor, and deduplication can be done on the fly based on <CheckName>::<FilePath>::<FileOffset>; we still need clang-apply-replacements to deduplicate replacements; I'm happy to help with this. Step 2 could be done on your own, with just a simple script.
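
A key of that shape could be built as in the sketch below. The function names and the sample tuples (including the offsets) are made up for illustration. Note that such a key does not look at the message text or any notes.

```python
def dedup_key(check_name, file_path, file_offset):
    """Build a '<CheckName>::<FilePath>::<FileOffset>' key."""
    return "{}::{}::{}".format(check_name, file_path, file_offset)


def unique_by_key(diagnostics):
    """Keep the first key seen for every diagnostic, in input order."""
    seen = set()
    kept = []
    for check_name, file_path, file_offset in diagnostics:
        key = dedup_key(check_name, file_path, file_offset)
        if key not in seen:
            seen.add(key)
            kept.append(key)
    return kept


# Hypothetical (check, file, offset) tuples from two translation units.
diagnostics = [
    ("modernize-use-override", "shared_header.h", 40),
    ("modernize-use-override", "file1.cpp", 120),
    ("modernize-use-override", "shared_header.h", 40),
]
keys = unique_by_key(diagnostics)
print(keys)
```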

At the moment clang-apply-replacements is called at the end of a clang-tidy run in run-clang-tidy.py. That means we produce ~GBs of YAML first, to then emit ~10MBs worth of it.

That's why I suggest using some other space-efficient format to store the fixes. My intuition is that the final deduplicated result shouldn't be too large (even for YAML), because 1) no duplication 2) these are actual diagnostics in code, a healthy codebase shouldn't contain lots of problems 3) you have mentioned that you use it for small projects :)

@hokein you and I seem to be making the same proposal :)

Could you please explain your motivation for capturing clang-tidy's stdout? --export-fixes emits every diagnostic to YAML, even if the diagnostic doesn't have fixes. I guess the reason is that you want code snippets that you could show to users? If so, I think this is a separate UX problem, since we have everything in the emitted YAML, and we could construct whatever messages we want from it.

A bit for pragmatic reasons and a bit as a precaution.

You are right about the code snippet. I want to check for false positives in new clang-tidy checks; if I can just scroll through and see the code snippet in question, that is simply practical.
Diagnostics in template code might emit multiple warnings at the same code position. There is a realistic chance that "warning: xy happened here" will be the same for all template instantiations, and only the "note: type 'XY' does not match" gives the differentiating hint. If dedup happens _ONLY_ on the first warning I fear we might lose valid diagnostics! I did re-evaluate and it seems that the emitted YAML does not include the notes. That is an issue; for example, the CSA output relies on the emitted notes that explain the path to the bug.
I originally implemented it for my buildbot, which parses the check name, location and so on and then gives an ordered output for each check in a module. I extracted the essence for deduplication. -export-fixes still emits the clang-tidy diagnostics, so for my concrete use case YAML-based de-duplication brings no value in its current form, as my buildbot still struggles with the amount of stdout.

1) run clang-tidy in parallel on the whole project and emit a deduplicated result (fixes.yaml).

2) run a postprocessing step in your buildbot that constructs diagnostic messages from fixes.yaml and stores them somewhere.

3) do whatever you want with the output from 1) and 2).

Step 1 could be done upstream, probably via AllTUsExecutor, and deduplication can be done on the fly based on <CheckName>::<FilePath>::<FileOffset>; we still need clang-apply-replacements to deduplicate replacements; I'm happy to help with this. Step 2 could be done on your own, with just a simple script.

At the moment clang-apply-replacements is called at the end of a clang-tidy run in run-clang-tidy.py. That means we produce ~GBs of YAML first, to then emit ~10MBs worth of it.

That's why I suggest using some other space-efficient format to store the fixes. My intuition is that the final deduplicated result shouldn't be too large (even for YAML), because 1) no duplication 2) these are actual diagnostics in code, a healthy codebase shouldn't contain lots of problems 3) you have mentioned that you use it for small projects :)

To 3) I do use it for all kinds of projects, LLVM and Blender are currently the biggest ones. I want to go for LibreOffice, Chromium and so on as well. But right now the amount of noise is the biggest obstacle. My goal is not to check if the project is healthy/provide a service for the project, but check if _we_ have bugs in our checks and if code-transformation is correct, false positives, too much output, misleading messages, ...

To 2) LLVM is very chatty as well, I don't consider LLVM to be a bad code-base. Take readability-braces-around-statements for example. I want to test if the check transforms all possible places correctly. LLVM does not follow this style, and LLVM has a lot of big headers that implement functionality and are transitively included a lot. LLVM is the one that overflowed my 32GB of RAM :)

To 1) I do agree and the data presented supports that. I suspect that the YAML-to-stdout ratio is maybe 2:1 to 3:1? So in the analyzed case we end up with 10-15MB of data instead of ~600MB (all YAML). Space optimization is something we can tackle afterwards, as it does not seem to be pathological after deduplication.

In general: I somewhat consider this patch as rejected, I will still use it locally for my BB, but I think this revision should be closed. We could move the discussion here to the mailing-list if you want. It is ok to continue here as well, as we already started to make plans :)
My opinion is, that we should put as much of the deduplication into clang-tidy itself and not rely on tools like run-clang-tidy.py if we can.

So for me step 1 would be providing AllTUsExecutor in clang-tidy and making it parallel itself. For dedup we need to hook into the diagnostics. CSA has the BugReport class that could be hashed. clang-tidy currently doesn't have this; maybe a similar approach (or the same?) would help us out.

Push some fixes I made while working with this script, for reference for others
potentially looking at this patch.

Harbormaster completed remote builds in B25272: Diff 175063. Nov 22 2018, 10:34 AM

After countless attempts at fixing the unicode problem, it is finally done.
Remove all unnecessary whitespace
Remove the "xxx warnings generated." lines as well

This setup now runs in my BB and is a good approximation (for me) how the
deduplication should work in the future.

Still just for reference and documentation.

Harbormaster completed remote builds in B25370: Diff 175459. Nov 27 2018, 5:25 AM

Make the script more usable in my buildbot context.
Reduce the test files.
Fix Unicode issues I encountered while using it.

Herald added a reviewer: serge-sans-paille. Jan 17 2019, 11:26 AM

Harbormaster completed remote builds in B27005: Diff 182359. Jan 17 2019, 11:27 AM

LLVM is very chatty as well, I don't consider LLVM to be a bad code-base. Take readability-braces-around-statements for example.

Do we need a llvm-elide-braces-for-small-statements?

This would make a great pre-review check

In D54141#1362924, @MyDeveloperDay wrote:

LLVM is very chatty as well, I don't consider LLVM to be a bad code-base. Take readability-braces-around-statements for example.

Do we need a llvm-elide-braces-for-small-statements?

This would make a great pre-review check

IMHO it wouldn't hurt. It could even be a readability one. But we need to think in general about how to deal with conflicting checks, in this case especially.
Maybe we could extend the readability-braces-around-statements check with its anti-check and make it configurable, so that conflicting diagnostics can't be emitted?

In D54141#1289980, @hokein wrote:

If you're suggesting proceeding with this regex-based solution, I don't think that's a good idea. Why commit a hack which people will object to ever removing? Just see if we can do the right thing instead.

+1, my main concern is the complexity of the patch and maintenance burden of the python script.

I think these are reasonable concerns and to a degree I share them. At the same time, I worry we may be leaving useful functionality behind in favor of functionality that doesn't exist and doesn't appear to be moving forward. If we were to move forward with this patch, nothing prevents us from surfacing it more naturally later when we have the infrastructure in place for the better solution, correct?

At the moment clang-apply-replacements is called at the end of a clang-tidy run in run-clang-tidy.py. That means we produce GBs of YAML first, only to then emit ~10MB worth of it.

That's why I suggest using some other, more space-efficient format to store the fixes. My intuition is that the final deduplicated result shouldn't be too large (even for YAML), because 1) there is no duplication, 2) these are actual diagnostics in code, and a healthy codebase shouldn't contain lots of problems, and 3) you have mentioned that you use it for small projects :)

Re: #2, I don't think that assertion is true in practice. I expect there are plenty of projects that contain a lot of clang-tidy diagnostics, especially given that clang-tidy checks tend to have higher false positive rates. Even if clang-tidy checks were not so chatty, "shouldn't" and "don't" are very different measurements.

I'm not suggesting to plow full-steam-ahead with this patch or that the concerns raised here are invalid, but at the same time, I think it does solve a real problem and it would be a shame to lose a workable solution because something better might be possible. If work is taking place to actually implement that something better, then that's a different matter of course. I get the impression though that "something better" is an extensive amount of work compared to what's in front of us; am I misunderstanding?

In D54141#1291509, @JonasToth wrote:

My opinion is that we should put as much of the deduplication as possible into clang-tidy itself and not rely on tools like run-clang-tidy.py if we can.

Strong +1. TBH, I was unaware people used run-clang-tidy.py. ;-)

lebedev.ri resigned from this revision. Jul 10 2019, 2:42 PM

Herald added a project: Restricted Project. Jul 10 2019, 2:42 PM

won't happen anymore realistically.

Herald added a subscriber: mgehre. Feb 21 2022, 9:26 AM

Revision Contents

Path	Size
clang-tidy/
  tool/
    run-clang-tidy.py	190 lines
    run_clang_tidy.py	1 line
    test_input/
      out_csa_cmake.log	408 lines
      out_performance_cmake.log	93 lines
    test_log_parser.py	236 lines
docs/
  ReleaseNotes.rst	3 lines

Commit	Tree	Parents	Author	Summary	Date
11dc914f89ad	656519a36b8b	6585f6a83046 5f38fd6e3d7b	Jonas Toth	Merge branch 'feature_ct_dedup' of github.com:JonasToth/clang-tools-extra into…	Nov 6 2018, 2:20 AM
6585f6a83046	656519a36b8b	16f37a8f6755	Jonas Toth	[clang-tidy] add deduplication support for run-clang-tidy.py	Nov 6 2018, 2:05 AM
5f38fd6e3d7b	656519a36b8b	16f37a8f6755	Jonas Toth	[Fix] parser resetting only if possible	Nov 6 2018, 2:05 AM
16f37a8f6755	3df2ed5dde25	78d9754c90cc	Jonas Toth	[Misc] remove whitespace at end of file	Nov 6 2018, 2:01 AM
78d9754c90cc	07f61829c340	ee3a8a419f20	Jonas Toth	[Test] add file parsing tests	Nov 6 2018, 2:00 AM
ee3a8a419f20	41fef09ea9ea	cdc11f4dd131	Jonas Toth	[Misc] shorten performance log	Nov 6 2018, 1:48 AM
cdc11f4dd131	3f0b2f5adcf6	cc92d25fbbaa	Jonas Toth	[Misc] remove noise completly	Nov 6 2018, 1:46 AM
cc92d25fbbaa	170af36cda91	472a3fdafff8	Jonas Toth	[Misc] remove noise output from log	Nov 6 2018, 1:43 AM
472a3fdafff8	2a272616bfdf	c6ceca1f80ed	Jonas Toth	[Misc] clean up test data	Nov 6 2018, 1:41 AM
c6ceca1f80ed	522bb3a1bfd0	4fd1be761bdd	Jonas Toth	[Feature] hide deduplication behind command line option	Nov 6 2018, 12:37 AM
4fd1be761bdd	55dd34782d69	0d2cc978c037	Jonas Toth	[Fix] move parser out of each thread to remove final duplicates	Nov 5 2018, 2:01 PM
0d2cc978c037	95ae45a9c56b	3ae6b5a6509c	Jonas Toth	[Feature] switch to sha256 hashes for deduplication	Nov 5 2018, 1:27 PM
3ae6b5a6509c	b62e9dae4802	b28415ac8ac8	Jonas Toth	[Refactor] move deduplication code around	Nov 5 2018, 12:56 PM
b28415ac8ac8	e37ef4c8f4c1	e94f2c57cf49	Jonas Toth	[Fix] get run-clang-tidy.py running again	Nov 5 2018, 11:43 AM
e94f2c57cf49	bbbdff8f72ea	e9837be545ab	Jonas Toth	[Feature] use the deduplication parser in run-clang-tidy.py	Nov 5 2018, 11:23 AM
e9837be545ab	7625eee1fcfa	147f0b28a492	Jonas Toth	[Refactor] remove old utility.py file	Nov 5 2018, 11:17 AM
147f0b28a492	f8939769f287	d5d8cc756383	Jonas Toth	[Refactor] move parsing code into run-clang-tidy, fix tests	Nov 5 2018, 11:16 AM
d5d8cc756383	7735ec381d03	c266039640dc	Jonas Toth	[Fix] logic bug in deduplication, resetting shall not clean dedup data	Nov 5 2018, 11:08 AM
c266039640dc	e634cac269ab	f95be2bdb844	Jonas Toth	[Feature] implement deduplication parser and test on real data	Nov 5 2018, 11:02 AM
f95be2bdb844	33ea8ed15893	1ecbb5f7cc4e	Jonas Toth	[Feature] implement basic data structure for efficient deduplication	Nov 5 2018, 8:52 AM
1ecbb5f7cc4e	38f394033170	9de4f34bce7f	Jonas Toth	[Misc] more test data	Nov 5 2018, 7:19 AM
9de4f34bce7f	6cfdbbdf6393	e936bbdce059	Jonas Toth	[Feature] start implementation for diagnostic deduplication in run-clang-tidy.	Nov 5 2018, 4:23 AM

Diff 172726

clang-tidy/tool/run-clang-tidy.py

Show All 32 Lines
Compilation database setup:		Compilation database setup:
http://clang.llvm.org/docs/HowToSetupToolingForLLVM.html		http://clang.llvm.org/docs/HowToSetupToolingForLLVM.html
"""		"""

from __future__ import print_function		from __future__ import print_function

import argparse		import argparse
import glob		import glob
		import hashlib
import json		import json
import multiprocessing		import multiprocessing
import os		import os
import re		import re
import shutil		import shutil
import subprocess		import subprocess
import sys		import sys
import tempfile		import tempfile
import threading		import threading
import traceback		import traceback
import yaml		import yaml

is_py2 = sys.version[0] == '2'		is_py2 = sys.version[0] == '2'

if is_py2:		if is_py2:
import Queue as queue		import Queue as queue
else:		else:
import queue as queue		import queue as queue


		class Diagnostic(object):
		    """
		    This class represents a parsed diagnostic message coming from clang-tidy
		    output. While parsing the raw output each new diagnostic will incrementally
		    build a temporary object of this class. Once the end of the diagnostic
		    message is found its content is hashed with SHA256 and stored in a set.
		    """

		    def __init__(self, path, line, column, diag):
		        """
		        Start initializing this object. The source location is always known
		        as it is emitted first and always in a single line.
		        `diag` will contain all warning/error/note information until the first
		        line-break. Line breaks inside a diagnostic are very uncommon, but for
		        example CSA's PaddingChecker emits a multi-line warning containing the
		        optimal layout of a record. These additional lines must be added after
		        creation of the `Diagnostic`.
		        """
		        self._path = path
		        self._line = line
		        self._column = column
		        self._diag = diag
		        self._additional = ""

		    def add_additional_line(self, line):
		        """Store more additional information line per line while parsing."""
		        self._additional += "\n" + line

		    def get_fingerprint(self):
		        """Return a secure fingerprint (SHA256 hash) of the diagnostic."""
		        return hashlib.sha256(self.__str__().encode("utf-8")).hexdigest()

		    def __str__(self):
		        """Transform the object back into a raw diagnostic."""
		        return self._path + ":" + str(self._line) + ":" + str(self._column)\
		            + ": " + self._diag + self._additional


		class Deduplication(object):
		    """
		    This class provides an interface to deduplicate diagnostics emitted from
		    `clang-tidy`. It maintains a `set` of SHA 256 hashes of the diagnostics
		    and allows querying whether a diagnostic was already emitted
		    (according to the corresponding hash of the diagnostic string!).
		    """

		    def __init__(self):
		        """Initializes an empty set."""
		        self._set = set()

		    def insert_and_query(self, diag):
		        """
		        This method returns True if the `diag` was NOT emitted already
		        signaling that the parser shall store/emit this diagnostic.
		        If the `diag` was stored already this method returns False and has
		        no effect.
		        """
		        fp = diag.get_fingerprint()
		        if fp not in self._set:
		            self._set.add(fp)
		            return True
		        return False


		def _is_valid_diag_match(match_groups):
		    """Return true if all elements in `match_groups` are not None."""
		    return all(g is not None for g in match_groups)


		def _diag_from_match(match_groups):
		    """Helper function to create a diagnostic object from a regex match."""
		    return Diagnostic(
		        str(match_groups[0]), int(match_groups[1]), int(match_groups[2]),
		        str(match_groups[3]) + ": " + str(match_groups[4]))


		class ParseClangTidyDiagnostics(object):
		    """
		    This class is a stateful parser for `clang-tidy` diagnostic output.
		    The parser collects all unique diagnostics that can be emitted after
		    deduplication.
		    """

		    def __init__(self):
		        super(ParseClangTidyDiagnostics, self).__init__()
		        self._diag_re = re.compile(
		            r"^(.+):(\d+):(\d+): (error|warning): (.*)$")
		        self._current_diag = None

		        self._dedup = Deduplication()
		        self._uniq_diags = list()

		    def reset_parser(self):
		        """
		        Clean the parsing data to prepare for another set of output from
		        `clang-tidy`. The deduplication is not cleaned because that data
		        is required between multiple parsing runs. The diagnostics are cleaned
		        as the parser assumes the new unique diagnostics are consumed before
		        the parser is reset.
		        """
		        self._current_diag = None
		        self._uniq_diags = list()

		    def get_diags(self):
		        """
		        Returns a list of diagnostics that can be emitted after parsing the
		        full output of a `clang-tidy` invocation.
		        The list contains no duplicates.
		        """
		        return self._uniq_diags

		    def parse_string(self, input_str):
		        """Parse a string, e.g. captured stdout."""
		        self._parse_lines(input_str.splitlines())

		    def _parse_line(self, line):
		        """Parses one line and returns nothing."""
		        match = self._diag_re.match(line)

		        # A new diagnostic is found (either error or warning).
		        if match and _is_valid_diag_match(match.groups()):
		            self._handle_new_diag(match.groups())

		        # There was no new diagnostic but a previous diagnostic is in flight.
		        # Interpret this situation as additional output like notes or
		        # code-pointers from the diagnostic that is in flight.
		        elif not match and self._current_diag:
		            self._current_diag.add_additional_line(line)

		        # There was no diagnostic in flight and this line did not create a
		        # new one. This situation should not occur, but might happen if
		        # `clang-tidy` emits information before warnings start.
		        else:
		            return

		    def _handle_new_diag(self, match_groups):
		        """Potentially store an in-flight diagnostic and create a new one."""
		        self._register_diag()
		        self._current_diag = _diag_from_match(match_groups)

		    def _register_diag(self):
		        """
		        Stores a potential in-flight diagnostic if it is a new unique message.
		        """
		        # The current in-flight diagnostic was not emitted before, therefore
		        # it should be stored as a new unique diagnostic.
		        if self._current_diag and \
		                self._dedup.insert_and_query(self._current_diag):
		            self._uniq_diags.append(self._current_diag)

		    def _parse_lines(self, line_list):
		        """Parse a list of lines without \\n at the end of each string."""
		        assert self._current_diag is None, \
		            "Parser not in a clean state to restart parsing"

		        for line in line_list:
		            self._parse_line(line.rstrip())
		        # Register the last diagnostic after all input is parsed.
		        self._register_diag()

		    def _parse_file(self, filename):
		        """Helper to parse a full file, for testing purposes only."""
		        with open(filename, "r") as input_file:
		            self._parse_lines(input_file.readlines())


def find_compilation_database(path):		def find_compilation_database(path):
"""Adjusts the directory until a compilation database is found."""		"""Adjusts the directory until a compilation database is found."""
result = './'		result = './'
while not os.path.isfile(os.path.join(result, path)):		while not os.path.isfile(os.path.join(result, path)):
if os.path.realpath(result) == '/':		if os.path.realpath(result) == '/':
print('Error: could not find compilation database.')		print('Error: could not find compilation database.')
sys.exit(1)		sys.exit(1)
result += '../'		result += '../'
▲ Show 20 Lines • Show All 80 Lines • ▼ Show 20 Lines	def apply_fixes(args, tmpdir):
if args.format:		if args.format:
invocation.append('-format')		invocation.append('-format')
if args.style:		if args.style:
invocation.append('-style=' + args.style)		invocation.append('-style=' + args.style)
invocation.append(tmpdir)		invocation.append(tmpdir)
subprocess.call(invocation)		subprocess.call(invocation)


def run_tidy(args, tmpdir, build_path, queue, lock, failed_files):		def run_tidy(args, tmpdir, build_path, queue, lock, failed_files, parser):
"""Takes filenames out of queue and runs clang-tidy on them."""		"""Takes filenames out of queue and runs clang-tidy on them."""
while True:		while True:
name = queue.get()		name = queue.get()
invocation = get_tidy_invocation(name, args.clang_tidy_binary, args.checks,		invocation = get_tidy_invocation(name, args.clang_tidy_binary, args.checks,
tmpdir, build_path, args.header_filter,		tmpdir, build_path, args.header_filter,
args.extra_arg, args.extra_arg_before,		args.extra_arg, args.extra_arg_before,
args.quiet, args.config)		args.quiet, args.config)

proc = subprocess.Popen(invocation, stdout=subprocess.PIPE, stderr=subprocess.PIPE)		proc = subprocess.Popen(invocation, stdout=subprocess.PIPE, stderr=subprocess.PIPE)
output, err = proc.communicate()		output, err = proc.communicate()
if proc.returncode != 0:		if proc.returncode != 0:
failed_files.append(name)		failed_files.append(name)

with lock:		with lock:
sys.stdout.write(' '.join(invocation) + '\n' + output.decode('utf-8') + '\n')		invoc = ' '.join(invocation) + '\n'
		if parser:
		parser.parse_string(output.decode('utf-8'))
		sys.stdout.write(invoc\
		+ "\n".join([str(diag) for diag in parser.get_diags()])\
		+ '\n')
		parser.reset_parser()
		else:
		sys.stdout.write(invoc + output.decode('utf-8') + '\n')

if len(err) > 0:		if len(err) > 0:
sys.stderr.write(err.decode('utf-8') + '\n')		sys.stderr.write(err.decode('utf-8') + '\n')

queue.task_done()		queue.task_done()


def main():		def main():
parser = argparse.ArgumentParser(description='Runs clang-tidy over all files '		parser = argparse.ArgumentParser(description='Runs clang-tidy over all files '
'in a compilation database. Requires '		'in a compilation database. Requires '
'clang-tidy and clang-apply-replacements in '		'clang-tidy and clang-apply-replacements in '
'$PATH.')		'$PATH.')
Show All 38 Lines	parser.add_argument('-extra-arg', dest='extra_arg',
help='Additional argument to append to the compiler '		help='Additional argument to append to the compiler '
'command line.')		'command line.')
parser.add_argument('-extra-arg-before', dest='extra_arg_before',		parser.add_argument('-extra-arg-before', dest='extra_arg_before',
action='append', default=[],		action='append', default=[],
help='Additional argument to prepend to the compiler '		help='Additional argument to prepend to the compiler '
'command line.')		'command line.')
parser.add_argument('-quiet', action='store_true',		parser.add_argument('-quiet', action='store_true',
help='Run clang-tidy in quiet mode')		help='Run clang-tidy in quiet mode')
		parser.add_argument('-deduplicate', action='store_true',
		help='Deduplicate diagnostic message from clang-tidy')
args = parser.parse_args()		args = parser.parse_args()

db_path = 'compile_commands.json'		db_path = 'compile_commands.json'

if args.build_path is not None:		if args.build_path is not None:
build_path = args.build_path		build_path = args.build_path
else:		else:
# Find our database		# Find our database
Show All 29 Lines	def main():

return_code = 0		return_code = 0
try:		try:
# Spin up a bunch of tidy-launching threads.		# Spin up a bunch of tidy-launching threads.
task_queue = queue.Queue(max_task)		task_queue = queue.Queue(max_task)
# List of files with a non-zero return code.		# List of files with a non-zero return code.
failed_files = []		failed_files = []
lock = threading.Lock()		lock = threading.Lock()
		parser = None
		if args.deduplicate:
		parser = ParseClangTidyDiagnostics()
for _ in range(max_task):		for _ in range(max_task):
t = threading.Thread(target=run_tidy,		t = threading.Thread(target=run_tidy,
args=(args, tmpdir, build_path, task_queue, lock, failed_files))		args=(args, tmpdir, build_path, task_queue, lock, failed_files, parser))
t.daemon = True		t.daemon = True
t.start()		t.start()

# Fill the queue with files.		# Fill the queue with files.
for name in files:		for name in files:
if file_name_re.search(name):		if file_name_re.search(name):
task_queue.put(name)		task_queue.put(name)

Show All 37 Lines
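The parse-and-deduplicate flow of this diff can be tried out in isolation. The following is a minimal sketch and not the patch itself: it condenses the same idea (match the `path:line:column: warning|error:` header line, attach the following lines to the diagnostic in flight, fingerprint the whole block with SHA-256, keep one set across invocations) into a single hypothetical function `dedup_output`. The sample paths and messages are made up for illustration.

```python
import hashlib
import re

# Same shape as the header regex in the diff: path:line:column: severity: text
DIAG_RE = re.compile(r"^(.+):(\d+):(\d+): (error|warning): (.*)$")


def dedup_output(raw_output, seen):
    """Return the diagnostics in `raw_output` whose hash is not yet in `seen`.

    `seen` is a set of SHA-256 hex digests shared between invocations and is
    updated in place. Hypothetical helper, for illustration only.
    """
    blocks = []
    for line in raw_output.splitlines():
        line = line.rstrip()
        if DIAG_RE.match(line):
            blocks.append([line])      # a new warning/error starts here
        elif blocks:
            blocks[-1].append(line)    # note, code snippet or caret line
        # Lines before the first diagnostic are ignored.

    unique = []
    for block in blocks:
        text = "\n".join(block)
        digest = hashlib.sha256(text.encode("utf-8")).hexdigest()
        if digest not in seen:
            seen.add(digest)
            unique.append(text)
    return unique


# Two translation units that include the same header (made-up sample data).
header_warning = (
    "/src/widget.h:10:12: warning: use nullptr [modernize-use-nullptr]\n"
    "  int *p = 0;\n"
    "           ^"
)
tu_a = header_warning + "\n/src/a.cpp:5:1: warning: unused variable 'x'"
tu_b = header_warning + "\n/src/b.cpp:7:1: error: unknown type name 'Foo'"

seen = set()
first = dedup_output(tu_a, seen)
second = dedup_output(tu_b, seen)
print(len(first), len(second))
```

The header warning is reported for the first translation unit only; the second call returns just the error that is specific to `b.cpp`. As in the diff, `note:` lines do not match the header regex and therefore become part of the fingerprint of the preceding warning or error.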

clang-tidy/tool/run_clang_tidy.py

This file was added.

Property	Old Value	New Value
File Mode	null	120000

				run-clang-tidy.py
				JonasToth (inline comment): This symlink is required for my unit tests. I don't know how to add the new tests to the `lit` test suite, so there is no change there yet. A bit of guidance there would be nice :)
				No newline at end of file
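The symlink exists because `run-clang-tidy.py` contains hyphens and so cannot be named in a plain `import` statement. As a possible alternative, and only as a sketch under the assumption that the tests run on Python 3, a file with such a name can be loaded through `importlib`. The helper `load_script_as_module` and the throwaway script below are hypothetical and not part of this revision.

```python
import importlib.util
import os
import tempfile


def load_script_as_module(path, module_name):
    """Import a Python file whose file name is not a valid identifier."""
    spec = importlib.util.spec_from_file_location(module_name, path)
    module = importlib.util.module_from_spec(spec)
    spec.loader.exec_module(module)   # runs the file's top-level code
    return module


# Demonstration with a throwaway hyphenated script standing in for the tool.
with tempfile.TemporaryDirectory() as tmp:
    script = os.path.join(tmp, "hyphen-named-script.py")
    with open(script, "w") as handle:
        handle.write("def answer():\n    return 42\n")
    mod = load_script_as_module(script, "hyphen_named_script")
    result = mod.answer()

print(result)
```

Loading a script this way executes its top-level code, so it is only suitable for files that keep their entry point behind a `__main__` guard.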

clang-tidy/tool/test_input/out_csa_cmake.log

This file was added.

				/project/git/Source/kwsys/testCommandLineArguments1.cxx:83:16: warning: Null pointer passed as an argument to a 'nonnull' parameter [clang-analyzer-core.NonNullParamChecker]
				strcmp(valid_unused_args[cc], newArgv[cc]) != 0) {
				^
				/project/git/Source/kwsys/testCommandLineArguments1.cxx:35:7: note: Assuming the condition is false
				if (!arg.Parse()) {
				^
				/project/git/Source/kwsys/testCommandLineArguments1.cxx:35:3: note: Taking false branch
				if (!arg.Parse()) {
				^
				/project/git/Source/kwsys/testCommandLineArguments1.cxx:39:7: note: Assuming 'n' is equal to 24
				if (n != 24) {
				^
				/project/git/Source/kwsys/testCommandLineArguments1.cxx:39:3: note: Taking false branch
				if (n != 24) {
				^
				/project/git/Source/kwsys/testCommandLineArguments1.cxx:43:7: note: Assuming 'm' is non-null
				if (!m || strcmp(m, "test value") != 0) {
				^
				/project/git/Source/kwsys/testCommandLineArguments1.cxx:43:7: note: Left side of '||' is false

				/project/git/Source/kwsys/testCommandLineArguments1.cxx:43:3: note: Taking false branch
				if (!m || strcmp(m, "test value") != 0) {
				^
				/project/git/Source/kwsys/testCommandLineArguments1.cxx:47:3: note: Taking true branch
				if (p != "1") {
				^
				/project/git/Source/kwsys/testCommandLineArguments1.cxx:54:3: note: Taking true branch
				if (m) {
				^
				/project/git/Source/kwsys/testCommandLineArguments1.cxx:71:7: note: Assuming 'newArgc' is equal to 9
				if (newArgc != 9) {
				^
				/project/git/Source/kwsys/testCommandLineArguments1.cxx:71:3: note: Taking false branch
				if (newArgc != 9) {
				^
				/project/git/Source/kwsys/testCommandLineArguments1.cxx:75:3: note: Loop condition is true. Entering loop body
				for (cc = 0; cc < newArgc; ++cc) {
				^
				/project/git/Source/kwsys/testCommandLineArguments1.cxx:79:5: note: Taking false branch
				if (cc >= 9) {
				^
				/project/git/Source/kwsys/testCommandLineArguments1.cxx:82:38: note: Left side of '&&' is false
				} else if (valid_unused_args[cc] &&
				^
				/project/git/Source/kwsys/testCommandLineArguments1.cxx:75:3: note: Loop condition is true. Entering loop body
				for (cc = 0; cc < newArgc; ++cc) {
				^
				/project/git/Source/kwsys/testCommandLineArguments1.cxx:77:5: note: Calling 'operator<<<std::char_traits<char>>'
				std::cout << "Unused argument[" << cc << "] = [" << newArgv[cc] << "]"
				^
				/usr/bin/../lib/gcc/x86_64-linux-gnu/7.3.0/../../../../include/c++/7.3.0/ostream:558:11: note: Assuming '__s' is null
				if (!__s)
				^
				/usr/bin/../lib/gcc/x86_64-linux-gnu/7.3.0/../../../../include/c++/7.3.0/ostream:558:7: note: Taking true branch
				if (!__s)
				^
				/project/git/Source/kwsys/testCommandLineArguments1.cxx:77:5: note: Returning from 'operator<<<std::char_traits<char>>'
				std::cout << "Unused argument[" << cc << "] = [" << newArgv[cc] << "]"
				^
				/project/git/Source/kwsys/testCommandLineArguments1.cxx:79:5: note: Taking false branch
				if (cc >= 9) {
				^
				/project/git/Source/kwsys/testCommandLineArguments1.cxx:82:16: note: Left side of '&&' is true
				} else if (valid_unused_args[cc] &&
				^
				/project/git/Source/kwsys/testCommandLineArguments1.cxx:83:16: note: Null pointer passed as an argument to a 'nonnull' parameter
				strcmp(valid_unused_args[cc], newArgv[cc]) != 0) {
				^
				/project/git/Utilities/cmcurl/lib/urldata.h:1209:8: warning: Excessive padding in 'struct UrlState' (61 padding bytes, where 5 is optimal).
				Optimal fields order:
				conn_cache,
				lastconnect,
				headerbuff,
				headersize,
				buffer,
				ulbuf,
				current_speed,
				first_host,
				session,
				sessionage,
				scratch,
				prev_signal,
				resolver,
				most_recent_ftp_entrypath,
				crlf_conversions,
				pathbuffer,
				path,
				range,
				resume_from,
				rtsp_next_client_CSeq,
				rtsp_next_server_CSeq,
				rtsp_CSeq_recv,
				infilesize,
				drain,
				fread_func,
				in,
				stream_depends_on,
				keeps_speed,
				expiretime,
				authhost,
				authproxy,
				timeoutlist,
				timenode,
				digest,
				proxydigest,
				tempwrite,
				expires,
				first_remote_port,
				tempcount,
				os_errno,
				httpversion,
				stream_weight,
				multi_owned_by_easy,
				this_is_a_follow,
				refused_stream,
				errorbuf,
				allow_port,
				authproblem,
				ftp_trying_alternative,
				wildcardmatch,
				expect100header,
				prev_block_had_trailing_cr,
				slash_removed,
				use_range,
				rangestringalloc,
				done,
				stream_depends_e,
				consider reordering the fields or adding explicit padding members [clang-analyzer-optin.performance.Padding]
				struct UrlState {
				^
				project/git/Utilities/cmcurl/lib/urldata.h:1457:8: warning: Excessive padding in 'struct UserDefined' (120 padding bytes, where 0 is optimal).
				Optimal fields order:
				err,
				debugdata,
				errorbuffer,
				proxyport,
				out,
				in_set,
				writeheader,
				rtp_out,
				use_port,
				httpauth,
				proxyauth,
				socks5auth,
				followlocation,
				maxredirs,
				postfields,
				seek_func,
				postfieldsize,
				fwrite_func,
				fwrite_header,
				fwrite_rtp,
				fread_func_set,
				fprogress,
				fxferinfo,
				fdebug,
				ioctl_func,
				fsockopt,
				sockopt_client,
				fopensocket,
				opensocket_client,
				fclosesocket,
				closesocket_client,
				seek_client,
				convfromnetwork,
				convtonetwork,
				convfromutf8,
				progress_client,
				ioctl_client,
				timeout,
				connecttimeout,
				accepttimeout,
				happy_eyeballs_timeout,
				server_response_timeout,
				tftp_blksize,
				filesize,
				low_speed_limit,
				low_speed_time,
				max_send_speed,
				max_recv_speed,
				set_resume_from,
				headers,
				proxyheaders,
				httppost,
				quote,
				postquote,
				prequote,
				source_quote,
				source_prequote,
				source_postquote,
				telnet_options,
				resolve,
				connect_to,
				timevalue,
				httpversion,
				general_ssl,
				dns_cache_timeout,
				buffer_size,
				upload_buffer_size,
				private_data,
				http200aliases,
				ipver,
				max_filesize,
				ssh_keyfunc,
				ssh_keyfunc_userp,
				ssh_auth_types,
				new_file_perms,
				new_directory_perms,
				allowed_protocols,
				redir_protocols,
				mail_rcpt,
				rtspversion,
				chunk_bgn,
				chunk_end,
				fnmatch,
				fnmatch_data,
				gssapi_delegation,
				tcp_keepidle,
				tcp_keepintvl,
				maxconnects,
				expect_100_timeout,
				stream_depends_on,
				stream_dependents,
				resolver_start,
				resolver_start_client,
				ssl,
				proxy_ssl,
				mimepost,
				str,
				keep_post,
				localportrange,
				is_fread_set,
				is_fwrite_set,
				timecondition,
				httpreq,
				proxytype,
				ftp_filemethod,
				ftp_create_missing_dirs,
				use_netrc,
				use_ssl,
				ftpsslauth,
				ftp_ccc,
				scope_id,
				rtspreq,
				stream_weight,
				localport,
				free_referer,
				tftp_no_options,
				sep_headers,
				cookiesession,
				crlf,
				strip_path_slash,
				ssh_compression,
				get_filetime,
				tunnel_thru_httpproxy,
				prefer_ascii,
				ftp_append,
				ftp_list_only,
				ftp_use_port,
				hide_progress,
				http_fail_on_error,
				http_keep_sending_on_error,
				http_follow_location,
				http_transfer_encoding,
				allow_auth_to_other_hosts,
				include_header,
				http_set_referer,
				http_auto_referer,
				opt_no_body,
				upload,
				2 warnings generated.

				verbose,
				krb,
				reuse_forbid,
				reuse_fresh,
				ftp_use_epsv,
				ftp_use_eprt,
				ftp_use_pret,
				no_signal,
				global_dns_cache,
				tcp_nodelay,
				ignorecl,
				ftp_skip_ip,
				connect_only,
				http_te_skip,
				http_ce_skip,
				proxy_transfer_mode,
				sasl_ir,
				wildcard_enabled,
				tcp_keepalive,
				tcp_fastopen,
				ssl_enable_npn,
				ssl_enable_alpn,
				path_as_is,
				pipewait,
				suppress_connect_headers,
				dns_shuffle_addresses,
				stream_depends_e,
				haproxyprotocol,
				abstract_unix_socket,
				disallow_username_in_url,
				consider reordering the fields or adding explicit padding members [clang-analyzer-optin.performance.Padding]
				struct UserDefined {
				^
				/project/git/Utilities/cmcurl/lib/urldata.h:1209:8: warning: Excessive padding in 'struct UrlState' (61 padding bytes, where 5 is optimal).
				Optimal fields order:
				conn_cache,
				lastconnect,
				headerbuff,
				headersize,
				buffer,
				ulbuf,
				current_speed,
				first_host,
				session,
				sessionage,
				scratch,
				prev_signal,
				resolver,
				most_recent_ftp_entrypath,
				crlf_conversions,
				pathbuffer,
				path,
				range,
				resume_from,
				rtsp_next_client_CSeq,
				rtsp_next_server_CSeq,
				rtsp_CSeq_recv,
				infilesize,
				drain,
				fread_func,
				in,
				stream_depends_on,
				keeps_speed,
				expiretime,
				authhost,
				authproxy,
				timeoutlist,
				timenode,
				digest,
				proxydigest,
				tempwrite,
				expires,
				first_remote_port,
				tempcount,
				os_errno,
				httpversion,
				stream_weight,
				multi_owned_by_easy,
				this_is_a_follow,
				refused_stream,
				errorbuf,
				allow_port,
				authproblem,
				ftp_trying_alternative,
				wildcardmatch,
				expect100header,
				prev_block_had_trailing_cr,
				slash_removed,
				use_range,
				rangestringalloc,
				done,
				stream_depends_e,
				consider reordering the fields or adding explicit padding members [clang-analyzer-optin.performance.Padding]
				struct UrlState {
				^
				/project/git/Tests/CMakeLib/testUTF8.cxx:11:3: warning: 3rd function call argument is an uninitialized value [clang-analyzer-core.CallAndMessage]
				printf("[0x%02X,0x%02X,0x%02X,0x%02X]", static_cast<int>(d[0]),
				^
				/project/git/Tests/CMakeLib/testUTF8.cxx:93:3: note: Loop condition is false. Execution continues on line 98
				for (test_utf8_entry const* e = good_entry; e->n; ++e) {
				^
				/project/git/Tests/CMakeLib/testUTF8.cxx:98:3: note: Loop condition is true. Entering loop body
				for (test_utf8_char const* c = bad_chars; (*c)[0]; ++c) {
				^
				/project/git/Tests/CMakeLib/testUTF8.cxx:99:5: note: Taking false branch
				if (!decode_bad(*c)) {
				^
				/project/git/Tests/CMakeLib/testUTF8.cxx:98:3: note: Loop condition is true. Entering loop body
				for (test_utf8_char const* c = bad_chars; (*c)[0]; ++c) {
				^
				/project/git/Tests/CMakeLib/testUTF8.cxx:99:10: note: Calling 'decode_bad'
				if (!decode_bad(*c)) {
				^
				/project/git/Tests/CMakeLib/testUTF8.cxx:80:7: note: Assuming 'e' is null
				if (e) {
				^
				/project/git/Tests/CMakeLib/testUTF8.cxx:80:3: note: Taking false branch
				if (e) {
				^
				/project/git/Tests/CMakeLib/testUTF8.cxx:85:3: note: Calling 'report_bad'
				report_bad(true, s);
				^
				/project/git/Tests/CMakeLib/testUTF8.cxx:46:32: note: '?' condition is true
				printf("%s: decoding bad ", passed ? "pass" : "FAIL");
				^
				/project/git/Tests/CMakeLib/testUTF8.cxx:47:3: note: Calling 'test_utf8_char_print'
				test_utf8_char_print(c);
				^
				/project/git/Tests/CMakeLib/testUTF8.cxx:11:3: note: 3rd function call argument is an uninitialized value
				printf("[0x%02X,0x%02X,0x%02X,0x%02X]", static_cast<int>(d[0]),
				^
				/project/git/Source/cmServer.cxx:519:3: warning: Call to virtual function during construction [clang-analyzer-optin.cplusplus.VirtualCall]
				AddNewConnection(connection);
				^
				/project/git/Source/cmServer.cxx:519:3: note: This constructor of an object of type 'cmServerBase' has not returned when the virtual method was called
				/project/git/Source/cmServer.cxx:519:3: note: Call to virtual function during construction

clang-tidy/tool/test_input/out_performance_cmake.log

This file was added.

				/project/git/Source/kwsys/Glob.cxx:200:28: warning: string concatenation results in allocation of unnecessary temporary strings; consider using 'operator+=' or 'string::append()' instead [performance-inefficient-string-concatenation]
				realname = dir + "/" + fname;
				^
				/project/git/Source/kwsys/Glob.cxx:223:47: warning: string concatenation results in allocation of unnecessary temporary strings; consider using 'operator+=' or 'string::append()' instead [performance-inefficient-string-concatenation]
				"' failed! Reason: '" + realPathErrorMessage + "'"));
				^
				/project/git/Source/kwsys/Glob.cxx:253:42: warning: string concatenation results in allocation of unnecessary temporary strings; consider using 'operator+=' or 'string::append()' instead [performance-inefficient-string-concatenation]
				message += canonicalPath + "/" + fname;
				^
				/project/git/Source/kwsys/Glob.cxx:305:28: warning: string concatenation results in allocation of unnecessary temporary strings; consider using 'operator+=' or 'string::append()' instead [performance-inefficient-string-concatenation]
				realname = dir + "/" + fname;
				^
				/project/git/Source/kwsys/SystemTools.cxx:1993:25: warning: 'find_first_of' called with a string literal consisting of a single character; consider using the more effective overload accepting a character [performance-faster-string-find]
				if (ret.find_first_of(" ") != std::string::npos) {
				^~~
				' '
				/project/git/Source/kwsys/SystemTools.cxx:2068:17: warning: local copy 'source_name' of the variable 'source' is never modified; consider avoiding the copy [performance-unnecessary-copy-initialization]
				std::string source_name = source;
				^
				const &
				/project/git/Source/kwsys/SystemTools.cxx:2212:19: warning: local copy 'source_name' of the variable 'source' is never modified; consider avoiding the copy [performance-unnecessary-copy-initialization]
				std::string source_name = source;
				^
				const &
				/project/git/Source/kwsys/SystemTools.cxx:3050:49: warning: 'rfind' called with a string literal consisting of a single character; consider using the more effective overload accepting a character [performance-faster-string-find]
				std::string::size_type slashPos = dir.rfind("/");
				^~~
				'/'
				/project/git/Source/kwsys/SystemTools.cxx:3207:32: warning: std::move of the const expression has no effect; remove std::move() [performance-move-const-arg]
				out_components.push_back(std::move(*i));
				^~~~~~~~~~ ~
				/project/git/Source/kwsys/SystemTools.cxx:3638:15: warning: local copy 'data' of the variable 'str' is never modified; consider avoiding the copy [performance-unnecessary-copy-initialization]
				std::string data(str);
				^
				const &
				/project/git/Source/kwsys/SystemTools.cxx:3688:47: warning: 'rfind' called with a string literal consisting of a single character; consider using the more effective overload accepting a character [performance-faster-string-find]
				std::string::size_type slash_pos = fn.rfind("/");
				^~~
				'/'
				/project/git/Source/kwsys/SystemInformation.cxx:1340:28: warning: 'rfind' called with a string literal consisting of a single character; consider using the more effective overload accepting a character [performance-faster-string-find]
				size_t at = file.rfind("/");
				^~~
				'/'
				/project/git/Source/kwsys/SystemInformation.cxx:3354:23: warning: 'find' called with a string literal consisting of a single character; consider using the more effective overload accepting a character [performance-faster-string-find]
				pos = buffer.find(":", pos);
				^~~
				':'
				/project/git/Source/kwsys/SystemInformation.cxx:3355:31: warning: 'find' called with a string literal consisting of a single character; consider using the more effective overload accepting a character [performance-faster-string-find]
				size_t pos2 = buffer.find("\n", pos);
				^~~~
				'\n'
				/project/git/Source/kwsys/SystemInformation.cxx:4605:43: warning: 'find' called with a string literal consisting of a single character; consider using the more effective overload accepting a character [performance-faster-string-find]
				size_t pos2 = this->SysCtlBuffer.find("\n", pos);
				^~~~
				'\n'
				/project/git/Source/kwsys/SystemInformation.cxx:5407:29: warning: 'find' called with a string literal consisting of a single character; consider using the more effective overload accepting a character [performance-faster-string-find]
				while ((pos = output.find("\r", pos)) != std::string::npos) {
				^~~~
				'\r'
				/project/git/Source/kwsys/SystemInformation.cxx:5413:29: warning: 'find' called with a string literal consisting of a single character; consider using the more effective overload accepting a character [performance-faster-string-find]
				while ((pos = output.find("\n", pos)) != std::string::npos) {
				^~~~
				'\n'
				/project/git/Utilities/cmjsoncpp/include/json/value.h:235:5: warning: move constructors should be marked noexcept [performance-noexcept-move-constructor]
				CZString(CZString&& other);
				^
				/project/git/Utilities/cmjsoncpp/include/json/value.h:241:15: warning: move assignment operators should be marked noexcept [performance-noexcept-move-constructor]
				CZString& operator=(CZString&& other);
				^
				/project/git/Utilities/cmjsoncpp/include/json/value.h:326:3: warning: move constructors should be marked noexcept [performance-noexcept-move-constructor]
				Value(Value&& other);
				^
				/project/git/Utilities/cmjsoncpp/src/lib_json/json_reader.cpp:1998:53: warning: the parameter 'key' is copied for each invocation but only used as a const reference; consider making it a const reference [performance-unnecessary-value-param]
				Value& CharReaderBuilder::operator[](JSONCPP_STRING key)
				^
				/project/git/Utilities/cmjsoncpp/include/json/value.h:235:5: warning: move constructors should be marked noexcept [performance-noexcept-move-constructor]
				CZString(CZString&& other);
				^
				/project/git/Utilities/cmjsoncpp/include/json/value.h:241:15: warning: move assignment operators should be marked noexcept [performance-noexcept-move-constructor]
				CZString& operator=(CZString&& other);
				^
				/project/git/Utilities/cmjsoncpp/include/json/value.h:326:3: warning: move constructors should be marked noexcept [performance-noexcept-move-constructor]
				Value(Value&& other);
				^
				/project/git/Utilities/cmjsoncpp/src/lib_json/json_value.cpp:278:18: warning: move constructors should be marked noexcept [performance-noexcept-move-constructor]
				Value::CZString::CZString(CZString&& other)
				^
				/project/git/Utilities/cmjsoncpp/src/lib_json/json_value.cpp:302:35: warning: move assignment operators should be marked noexcept [performance-noexcept-move-constructor]
				Value::CZString& Value::CZString::operator=(CZString&& other) {
				^
				/project/git/Utilities/cmjsoncpp/src/lib_json/json_value.cpp:490:8: warning: move constructors should be marked noexcept [performance-noexcept-move-constructor]
				Value::Value(Value&& other) {
				^

clang-tidy/tool/test_log_parser.py

This file was added.

				#!/usr/bin/env python
				# -*- coding: utf-8 -*-

				import unittest
				from run_clang_tidy import Diagnostic, Deduplication
				from run_clang_tidy import ParseClangTidyDiagnostics, _is_valid_diag_match


				class TestDiagnostics(unittest.TestCase):
				    """Test fingerprinting diagnostic messages"""

				    def test_construction(self):
				        d = Diagnostic("/home/user/project/my_file.h", 24, 4,
				                       "warning: Do not do this thing [warning-category]")
				        self.assertIsNotNone(d)
				        self.assertEqual(str(d),
				                         "/home/user/project/my_file.h:24:4: warning: Do not do this thing [warning-category]")

				        d.add_additional_line(" MyCodePiece();")
				        d.add_additional_line(" ^")

				        self.assertEqual(str(d),
				                         "/home/user/project/my_file.h:24:4: warning: Do not do this thing [warning-category]"
				                         "\n MyCodePiece();"
				                         "\n ^")


				class TestDeduplication(unittest.TestCase):
				    """Test the `DiagEssence` based deduplication of diagnostic messages."""

				    def test_construction(self):
				        self.assertIsNotNone(Deduplication())

				    def test_dedup(self):
				        dedup = Deduplication()
				        d = Diagnostic("/home/user/project/my_file.h", 24, 4,
				                       "warning: Do not do this thing [warning-category]")
				        self.assertTrue(dedup.insert_and_query(d))
				        self.assertFalse(dedup.insert_and_query(d))

				        d2 = Diagnostic("/home/user/project/my_file.h", 24, 4,
				                        "warning: Do not do this thing [warning-category]")
				        d2.add_additional_line(" MyCodePiece();")
				        d2.add_additional_line(" ^")
				        self.assertTrue(dedup.insert_and_query(d2))
				        self.assertFalse(dedup.insert_and_query(d2))

				        d3 = Diagnostic("/home/user/project/my_file.h", 24, 4,
				                        "warning: Do not do this thing [warning-category]")
				        self.assertFalse(dedup.insert_and_query(d3))

				class TestLinewiseParsing(unittest.TestCase):
				    def test_construction(self):
				        self.assertIsNotNone(ParseClangTidyDiagnostics())

				    def test_valid_diags_regex(self):
				        pp = ParseClangTidyDiagnostics()

				        warning = "/home/user/project/my_file.h:123:1: warning: don't do it [no]"
				        m = pp._diag_re.match(warning)
				        self.assertTrue(m)
				        self.assertTrue(_is_valid_diag_match(m.groups()))

				        error = "/home/user/project/my_file.h:1:110: error: wrong! [not-ok]"
				        m = pp._diag_re.match(error)
				        self.assertTrue(m)
				        self.assertTrue(_is_valid_diag_match(m.groups()))

				        hybrid = "/home/user/project/boo.cpp:30:42: error: wrong! [not-ok,bad]"
				        m = pp._diag_re.match(hybrid)
				        self.assertTrue(m)
				        self.assertTrue(_is_valid_diag_match(m.groups()))

				        note = "/home/user/project/my_file.h:1:110: note: alksdj"
				        m = pp._diag_re.match(note)
				        self.assertFalse(m)

				        garbage = "not a path:not_a_number:110: gibberish"
				        m = pp._diag_re.match(garbage)
				        self.assertFalse(m)

				    def test_single_diagnostics(self):
				        pp = ParseClangTidyDiagnostics()
				        example_warning = [
				            "/project/git/Source/kwsys/Terminal.c:53:21: warning: use of a signed integer operand with a binary bitwise operator [hicpp-signed-bitwise]",
				        ]
				        pp._parse_lines(example_warning)
				        self.assertEqual(
				            str(pp.get_diags()[0]),
				            "/project/git/Source/kwsys/Terminal.c:53:21: warning: use of a signed integer operand with a binary bitwise operator [hicpp-signed-bitwise]"
				        )

				    def test_no_diag(self):
				        pp = ParseClangTidyDiagnostics()
				        garbage_lines = \
				        """
				hicpp-no-array-decay
				hicpp-no-assembler
				hicpp-no-malloc
				hicpp-noexcept-move
				hicpp-signed-bitwise
				hicpp-special-member-functions
				hicpp-static-assert
				hicpp-undelegated-constructor
				hicpp-use-auto
				hicpp-use-emplace
				hicpp-use-equals-default
				hicpp-use-equals-delete
				hicpp-use-noexcept
				hicpp-use-nullptr
				hicpp-use-override
				hicpp-vararg

				clang-apply-replacements version 8.0.0
				18 warnings generated.
				36 warnings generated.
				Suppressed 26 warnings (26 in non-user code).
				Use -header-filter=.* to display errors from all non-system headers. Use -system-headers to display errors from system headers as well.

				61 warnings generated.
				122 warnings generated.
				Suppressed 122 warnings (122 in non-user code).
				Use -header-filter=.* to display errors from all non-system headers. Use -system-headers to display errors from system headers as well.

				clang-tidy -header-filter=^/project/git/.* -checks=-*,hicpp-* -export-fixes /tmp/tmpH8MVt0/tmpErKPl_.yaml -p=/project/git /project/git/Source/kwsys/Terminal.c
				"""
				        pp.parse_string(garbage_lines)
				        self.assertEqual(len(pp.get_diags()), 0)

				    def test_deduplicate_basic_multi_line_warning(self):
				        pp = ParseClangTidyDiagnostics()
				        example_warning = [
				            "/project/git/Source/kwsys/Terminal.c:53:21: warning: use of a signed integer operand with a binary bitwise operator [hicpp-signed-bitwise]",
				            "int default_tty = color & kwsysTerminal_Color_AssumeTTY;",
				            " ^",
				        ]

				        pp._parse_lines(example_warning + example_warning)
				        diags = pp.get_diags()

				        self.assertEqual(len(diags), 1)
				        self.assertEqual(
				            str(diags[0]),
				            "/project/git/Source/kwsys/Terminal.c:53:21: warning: use of a signed integer operand with a binary bitwise operator [hicpp-signed-bitwise]"
				            "\nint default_tty = color & kwsysTerminal_Color_AssumeTTY;"
				            "\n ^")

				    def test_real_diags(self):
				        pp = ParseClangTidyDiagnostics()
				        excerpt = \
				"""/project/git/Source/kwsys/Base64.c:54:35: warning: use of a signed integer operand with a binary bitwise operator [hicpp-signed-bitwise]
				dest[0] = kwsysBase64EncodeChar((src[0] >> 2) & 0x3F);
				^
				/project/git/Source/kwsys/Base64.c:54:36: warning: use of a signed integer operand with a binary bitwise operator [hicpp-signed-bitwise]
				dest[0] = kwsysBase64EncodeChar((src[0] >> 2) & 0x3F);
				^
				/project/git/Source/kwsys/Base64.c:56:27: warning: use of a signed integer operand with a binary bitwise operator [hicpp-signed-bitwise]
				kwsysBase64EncodeChar(((src[0] << 4) & 0x30) | ((src[1] >> 4) & 0x0F));
				^
				/project/git/Source/kwsys/Base64.c:56:28: warning: use of a signed integer operand with a binary bitwise operator [hicpp-signed-bitwise]
				kwsysBase64EncodeChar(((src[0] << 4) & 0x30) | ((src[1] >> 4) & 0x0F));
				^
				/project/git/Source/kwsys/Base64.c:54:35: warning: use of a signed integer operand with a binary bitwise operator [hicpp-signed-bitwise]
				dest[0] = kwsysBase64EncodeChar((src[0] >> 2) & 0x3F);
				^
				/project/git/Source/kwsys/Base64.c:54:36: warning: use of a signed integer operand with a binary bitwise operator [hicpp-signed-bitwise]
				dest[0] = kwsysBase64EncodeChar((src[0] >> 2) & 0x3F);
				^
				/project/git/Source/kwsys/Base64.c:56:27: warning: use of a signed integer operand with a binary bitwise operator [hicpp-signed-bitwise]
				kwsysBase64EncodeChar(((src[0] << 4) & 0x30) | ((src[1] >> 4) & 0x0F));
				^
				/project/git/Source/kwsys/Base64.c:56:28: warning: use of a signed integer operand with a binary bitwise operator [hicpp-signed-bitwise]
				kwsysBase64EncodeChar(((src[0] << 4) & 0x30) | ((src[1] >> 4) & 0x0F));
				^
				/project/git/Source/kwsys/testCommandLineArguments.cxx:16:10: warning: inclusion of deprecated C++ header 'string.h'; consider using 'cstring' instead [hicpp-deprecated-headers]
				#include <string.h> /* strcmp */
				^~~~~~~~~~
				<cstring>
				/project/git/Source/kwsys/testFStream.cxx:10:10: warning: inclusion of deprecated C++ header 'string.h'; consider using 'cstring' instead [hicpp-deprecated-headers]
				#include <string.h>
				^~~~~~~~~~
				<cstring>
				/project/git/Source/kwsys/testFStream.cxx:77:47: warning: do not implicitly decay an array into a pointer; consider using gsl::array_view or an explicit cast instead [hicpp-no-array-decay]
				out.write(reinterpret_cast<const char*>(expected_bom_data[i] + 1),
				^
				/project/git/Source/kwsys/testFStream.cxx:78:18: warning: do not implicitly decay an array into a pointer; consider using gsl::array_view or an explicit cast instead [hicpp-no-array-decay]
				*expected_bom_data[i]);
				^
				/project/git/Source/kwsys/testFStream.cxx:109:3: warning: use of a signed integer operand with a binary bitwise operator [hicpp-signed-bitwise]
				ret |= testNoFile();
				^
				/project/git/Source/kwsys/testFStream.cxx:110:3: warning: use of a signed integer operand with a binary bitwise operator [hicpp-signed-bitwise]
				ret |= testBOM();
				^
				/project/git/Source/kwsys/testSystemInformation.cxx:83:36: warning: use of a signed integer operand with a binary bitwise operator [hicpp-signed-bitwise]
				if (info.DoesCPUSupportFeature(static_cast<long int>(1) << i)) {
				^"""
				        pp.parse_string(excerpt)
				        self.assertEqual(len(pp.get_diags()), 11)

				        self.maxDiff = None
				        generated_diag = "\n".join(str(diag) for diag in pp.get_diags())
				        # It is not identical because of deduplication.
				        self.assertNotEqual(generated_diag, excerpt)

				        # The first 12 lines (4 diagnostics) are duplicated, but the rest is
				        # identical.
				        self.assertEqual(generated_diag,
				                         "\n".join(excerpt.splitlines()[12:]))

				        # Pretend that the next clang-tidy invocation returns its data
				        # and the parser shall deduplicate this one as well. This time
				        # no new data is expected.
				        pp.reset_parser()
				        pp.parse_string(excerpt)
				        self.assertEqual(len(pp.get_diags()), 0)

				    def test_log_files(self):
				        pp = ParseClangTidyDiagnostics()
				        pp._parse_file("test_input/out_csa_cmake.log")
				        self.assertEqual(len(pp.get_diags()), 5)

				        pp.reset_parser()
				        pp._parse_file("test_input/out_csa_cmake.log")
				        self.assertEqual(len(pp.get_diags()), 0)

				        pp.reset_parser()
				        pp._parse_file("test_input/out_performance_cmake.log")
				        self.assertEqual(len(pp.get_diags()), 24)

				        pp.reset_parser()
				        pp._parse_file("test_input/out_performance_cmake.log")
				        self.assertEqual(len(pp.get_diags()), 0)


				if __name__ == "__main__":
				    unittest.main()
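
The tests above exercise `Diagnostic`, `Deduplication` and a line-wise parser from `run_clang_tidy.py`, which is not shown in this part of the diff. As a rough illustration of the behaviour they expect (hash each diagnostic together with its trailing context lines, keep it only if the hash is new), here is a minimal sketch. It is not the patch's implementation: the `fingerprint` method, the `DIAG_RE` pattern, the `parse_lines` helper and the choice of SHA-256 are assumptions made for this example.

```python
import hashlib
import re


class Diagnostic(object):
    """One warning/error line plus the context lines that follow it."""

    def __init__(self, path, line, column, message):
        self.path = path
        self.line = line
        self.column = column
        self.message = message
        self.additional_lines = []

    def add_additional_line(self, text):
        self.additional_lines.append(text)

    def __str__(self):
        head = "{}:{}:{}: {}".format(self.path, self.line, self.column,
                                     self.message)
        return "\n".join([head] + self.additional_lines)

    def fingerprint(self):
        # Hypothetical helper: hash the full rendered text, so the same
        # location with different context counts as a different diagnostic.
        return hashlib.sha256(str(self).encode("utf-8")).hexdigest()


class Deduplication(object):
    """Remembers the fingerprints of all diagnostics seen so far."""

    def __init__(self):
        self._seen = set()

    def insert_and_query(self, diag):
        """Return True if `diag` is new, False if it was seen before."""
        key = diag.fingerprint()
        if key in self._seen:
            return False
        self._seen.add(key)
        return True


# Assumed pattern: only warnings and errors start a new diagnostic; notes and
# code snippets are treated as context of the preceding one.
DIAG_RE = re.compile(r"^(.+?):(\d+):(\d+): ((?:warning|error): .*)$")


def parse_lines(lines, dedup):
    """Group `lines` into diagnostics and return the ones not seen before."""
    unique = []
    current = None
    for text in lines:
        m = DIAG_RE.match(text)
        if m:
            if current is not None and dedup.insert_and_query(current):
                unique.append(current)
            current = Diagnostic(m.group(1), int(m.group(2)),
                                 int(m.group(3)), m.group(4))
        elif current is not None:
            current.add_additional_line(text)
    if current is not None and dedup.insert_and_query(current):
        unique.append(current)
    return unique
```

Passing the same `Deduplication` object to `parse_lines` for the output of every clang-tidy invocation would suppress diagnostics that an earlier invocation already reported, which is the behaviour `test_real_diags` and `test_log_files` check via `reset_parser()`.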

docs/ReleaseNotes.rst

- New alias :doc:`hicpp-uppercase-literal-suffix
:doc:`readability-uppercase-literal-suffix
<clang-tidy/checks/readability-uppercase-literal-suffix>`
added.

- The :doc:`readability-redundant-smartptr-get
<clang-tidy/checks/readability-redundant-smartptr-get>` check does not warn
about calls inside macros anymore by default.

		- `run-clang-tidy.py` support deduplication of `clang-tidy` diagnostics
		to reduce the amount of output with the optional `-deduplicate` flag.

Improvements to include-fixer
-----------------------------

The improvements are...

Improvements to modularize
--------------------------

The improvements are...