This is an archive of the discontinued LLVM Phabricator instance.

[clang-tidy] Add readability-duplicate-include check
ClosedPublic

Authored by LegalizeAdulthood on Feb 28 2015, 11:06 PM.

Details

Summary

Looks for duplicate includes and removes them.

Every time an include directive is processed, check a vector of filenames
to see if the included file has already been included. If so, it issues
a warning and a replacement to remove the entire line containing the
duplicated include directive.

When a macro is defined or undefined, the vector of filenames is cleared.
This enables including the same file multiple times, but getting
different expansions based on the set of active macros at the time of
inclusion. For example:

#undef NDEBUG
#include "assertion.h"
// ...code with assertions enabled

#define NDEBUG
#include "assertion.h"
// ...code with assertions disabled

Since macros are redefined between the inclusion of assertion.h,
they are not flagged as redundant.

Diff Detail

Event Timeline

LegalizeAdulthood retitled this revision from to Add readability-redundant-include check to clang-tidy.
LegalizeAdulthood updated this object.
LegalizeAdulthood edited the test plan for this revision. (Show Details)
LegalizeAdulthood added a reviewer: alexfh.
LegalizeAdulthood added a subscriber: Unknown Object (MLST).

Add doxygen comments to check's .h file

alexfh edited edge metadata.Mar 2 2015, 7:07 AM

I'd better put the effort in making the llvm-include-order check extensible (or configurable) enough to support custom include order, and then implement this inside the include order check. There's a TODO, BTW ;)

Will this suit your needs?

I think there is benefit to checking for redundant includes separately from include order.

On many code bases where I've worked the include order is all over the place. The main barrier in my experience to having a consistent include order hasn't been one of tools, but getting agreement from the team. For the llvm/clang code bases the team has already decided on the order of includes, but for many teams that can start a big political battle.

However, noone is going to disagree with eliminating redundant includes.

alexfh added a comment.Mar 3 2015, 7:45 AM

I think there is benefit to checking for redundant includes separately from include order.

On many code bases where I've worked the include order is all over the place. The main barrier in my experience to having a consistent include order hasn't been one of tools, but getting agreement from the team. For the llvm/clang code bases the team has already decided on the order of includes, but for many teams that can start a big political battle.

However, noone is going to disagree with eliminating redundant includes.

Makes sense. See a couple of initial comments below.

I have a feeling that some common machinery could be pulled from the llvm-include-order check which could benefit this check as well. I'll try to come up with a specific idea later.

clang-tidy/readability/CMakeLists.txt
10 ↗(On Diff #20953)

I'm not a native speaker, but I have a feeling that "duplicate" is a more suitable word here than "redundant". WDYT?

clang-tidy/readability/RedundantInclude.h
22 ↗(On Diff #20953)

This description is somewhat confusing. Maybe something along the lines of "Only consecutive #include directives without any other preprocessor directives between them are analyzed"?

clang-tidy/readability/CMakeLists.txt
10 ↗(On Diff #20953)

Fixed.

clang-tidy/readability/RedundantInclude.h
22 ↗(On Diff #20953)

Fixed.

LegalizeAdulthood retitled this revision from Add readability-redundant-include check to clang-tidy to Add readability-duplicate-include check to clang-tidy.
LegalizeAdulthood edited edge metadata.

Update check description in header file.
Rename check class to DuplicateIncludeCheck
Rename check to readability-duplicate-include

alexfh added inline comments.Mar 9 2015, 10:30 AM
clang-tidy/readability/DuplicateIncludeCheck.cpp
62 ↗(On Diff #21439)

What's the reason to limit the check to the main file only? I think, it should work on all headers as well. Also, sometimes it's fine to have duplicate includes even without macro definitions in between, e.g. when these #includes are in different namespaces.

I'd suggest using the same technique as in the IncludeOrderCheck: for each file collect all preprocessor directives sorted by SourceLocation. Then detect #include blocks (not necessarily the same way as its done in the IncludeOrderCheck. Maybe use the presense of any non-comment tokens between #includes as a boundary of blocks), and detect duplicate includes in each block.

clang-tidy/readability/DuplicateIncludeCheck.cpp
62 ↗(On Diff #21439)

If I remove the isInMainFile, how do I ensure that I don't attempt to modify system headers?

alexfh added inline comments.Mar 15 2015, 5:08 PM
clang-tidy/readability/DuplicateIncludeCheck.cpp
62 ↗(On Diff #21439)

Using SourceLocation::isInSystemHeader().

clang-tidy/readability/DuplicateIncludeCheck.cpp
62 ↗(On Diff #21439)

Thanks, I'll try that. The next question that brings to mind is how do I distinguish between headers that I own and headers used from third party libraries?

For instance, suppose I run a check on a clang refactoring tool and it uses isInSystemHeader and starts flagging issues in the clang tooling library headers.

The compile_commands.json doesn't contain any information about headers in my project, only translation units in my build, so it doesn't know whether or not included headers belong to me or third-party libraries.

clang-tidy/readability/DuplicateIncludeCheck.cpp
62 ↗(On Diff #21439)

For the benefit of others reading this, Alex pointed out to me the -header-filter and -system-headers options to clang-tidy. I think this means I don't need any narrowing if isExpansinInMainFile() in any of my matchers. I will do some experimenting to verify that this doesn't introduce regressions.

sbenza added inline comments.Mar 18 2015, 1:46 PM
clang-tidy/readability/DuplicateIncludeCheck.cpp
23 ↗(On Diff #21439)

Function names start with lowercase.

28 ↗(On Diff #21439)

This seems very wasteful, given that the SourceManager could tell you (if there was an API) the location for the next line in O(1) time.
Don't know if it is worth enough to add this API to SourceManager.

53 ↗(On Diff #21439)

These should not end with _

clang-tidy/readability/DuplicateIncludeCheck.cpp
28 ↗(On Diff #21439)

I thought it should have been easy to say "get me the source range that spans the line containing this #include directive", but I didn't find anything easier that worked properly for my test cases. I'm in the process of refactoring this check to share some common code with llvm-include-order and that might eliminate the need for this, but I haven't gotten that far yet.

sbenza added inline comments.Mar 19 2015, 10:34 AM
clang-tidy/readability/DuplicateIncludeCheck.cpp
28 ↗(On Diff #21439)

I'm ok with the way it is.
It is only being used when making a warning so it should not cost anything on clean code.
It was just a general comment.
Maybe adding a comment here would be good in case someone comes looking for a speedup opportunity.

alexfh added a comment.Nov 5 2015, 1:28 PM

What's the state of this patch?

I need to update from review comments and upload a new diff.

alexfh requested changes to this revision.Mar 18 2016, 3:55 AM
alexfh edited edge metadata.
This revision now requires changes to proceed.Mar 18 2016, 3:55 AM
clang-tidy/readability/DuplicateIncludeCheck.cpp
62 ↗(On Diff #21439)

If I remove the isInMainFile and replace it with a check to isInSystemHeader, then all my tests fail. I will have to investigate this further.

Include What You Use detect duplicated include directives. I think will be good idea to use it instead of Clang-tidy for much deeper analysis.

Review takes too long to make forward progress; abandoning.

LegalizeAdulthood reclaimed this revision.Jan 2 2022, 8:52 AM
LegalizeAdulthood marked 6 inline comments as done.
This revision now requires changes to proceed.Jan 2 2022, 8:52 AM
LegalizeAdulthood marked an inline comment as done.

Revive review with updated diff on top-of-tree

Herald added a project: Restricted Project. · View Herald TranscriptJan 2 2022, 8:54 AM

Functionality-wise this check is superseded by Include What You Use.

clang-tools-extra/docs/clang-tidy/checks/readability-duplicate-include.rst
7

Please fix double spaces. Same below.

LegalizeAdulthood retitled this revision from Add readability-duplicate-include check to clang-tidy to [clang-tidy] Add readability-duplicate-include check.Jan 2 2022, 9:22 AM

Move included headers to clang-tools-extra/test/clang-tidy/checkers/Inputs/readability-duplicate-include
Stub out included headers to isolate from system headers
Run test for C++98 or later

Functionality-wise this check is superseded by Include What You Use.

That's not part of clang-tidy and has it's own set of problems (false positives) when you run it.

clang-tools-extra/docs/clang-tidy/checks/readability-duplicate-include.rst
7

Please point me to a LLVM coding standards section that requires this.

LegalizeAdulthood marked an inline comment as done.

clang-format

Ping.

This review has been waiting for a week without any comment.

2nd. ping.

Still waiting for a review on this check.

Ping.

Still waiting for a review on this check.

The implementation isn't particularly long or complicated,
so it should not take much time to review.

Thanks.

Looks good! I need to familiarize myself better with the Loc manipulation code to be able to review the free-standing function but otherwise looks reasonable.

clang-tools-extra/clang-tidy/readability/DuplicateIncludeCheck.cpp
23

Move out of anonymous namespace and use static, as per LLVM guidelines.

65

I'm not familiar with isInMainFile, will the check work if the duplicated include is in a header included by the main .cpp file? Should a test be added for that?

clang-tools-extra/docs/clang-tidy/checks/readability-duplicate-include.rst
35–48

As an external user that is not deeply involved in the details of LLVM, I find this example rather complicated. Would it be possible to use a simpler example? What about this one with cassert

#undef NDEBUG
#include <cassert>

// ... code with assertions enabled

#define NDEBUG
#include <cassert>

// ... code with assertions disabled

There's also a negative vote from @alexfh , what about that?

clang-tools-extra/clang-tidy/readability/DuplicateIncludeCheck.cpp
65

Good catch. I originally wrote this check in 2015 (yes, that's
how long it's been waiting for a review) and recently revived it.
At the time, I didn't understand that checks could apply fixits to
headers and that user headers could be differentiated from system
headers, so this is a mistake.

clang-tools-extra/docs/clang-tidy/checks/readability-duplicate-include.rst
35–48

Good idea and I like your example better.

! In D7982#3258888, @carlosgalvezp wrote:
I need to familiarize myself better with the Loc manipulation
code to be able to review the free-standing function but otherwise
looks reasonable.

The "clang internals" documentation does a good job of explaining the
location classes:
https://clang.llvm.org/docs/InternalsManual.html#the-sourcelocation-and-sourcemanager-classes

This reminds me that I've been thinking we need to write some
documentation for clang-tidy check authors to help orient them
on the various parts involved in writing a check.

LegalizeAdulthood marked 3 inline comments as done.
  • Properly apply fixes to header files
  • Add tests for fixes in header files
  • Move function to static outside anonymous namespace
  • Improve documentation

@carlosgalvezp I believe I've addressed your comments now.
Thanks for the review, it definitely improved this check.

LegalizeAdulthood edited the summary of this revision. (Show Details)Jan 20 2022, 6:26 PM
LegalizeAdulthood marked an inline comment as done.
carlosgalvezp accepted this revision.Jan 20 2022, 11:26 PM

LGTM! Had some nits that can be fixed without review.

Can't see anything else major that needs change. As said I'm fairly new here so probably a more experienced reviewer can find some more improvements/optimizations, especially on the Loc aspects. On the other hand I think after 6 years it's about time to get this in, it can always be improved afterwards :)

clang-tools-extra/clang-tidy/readability/DuplicateIncludeCheck.cpp
85

Nit: as a n00b I would have appreciated a quick comment about why we can't directly use the HashLoc and FilenameRange as Start and End locations. Took me a while to understand this is to cover these cases:

#include "foo.h" -> needs updated Start
#include "bar.h" // Bar is needed! -> needs updated End

clang-tools-extra/test/clang-tidy/checkers/Inputs/readability-duplicate-include/readability-duplicate-include.h
2

Nit: #ifndef

clang-tools-extra/test/clang-tidy/checkers/Inputs/readability-duplicate-include/system/sys/types.h
2

Nit: write "This file is intentionally empty" for consistency with the other files you added? It's already clear from the folder structure that this file is an input for the readability-duplicate-include test.

clang-tools-extra/test/clang-tidy/checkers/readability-duplicate-include.cpp
20

Nit: it would be good to keep [readability-duplicate-include] part of the warning consistent - on Line 8 you have it but not on the other lines. So either keep it for all warnings or remove it from line 8.

clang-tools-extra/test/clang-tidy/checkers/Inputs/readability-duplicate-include/readability-duplicate-include.h
2

It's a style thing. I don't prefer #ifndef because it only differs from #ifdef by a single letter.

clang-tools-extra/test/clang-tidy/checkers/Inputs/readability-duplicate-include/system/sys/types.h
2

Oops, yeah, meant to do that with all of them and missed this one.
When I first authored this check 7 years ago, they didn't have the Inputs
folder.

clang-tools-extra/test/clang-tidy/checkers/readability-duplicate-include.cpp
20

CHECK-MESSAGES will match whatever substring you provide. To make
the test file less noisy, I do a full match on the entire diagnostic for the
first warning and then just the necessary bits for the remaining checks.

LGTM! Had some nits that can be fixed without review.

I think you need to do "Accept Revision" in phabricator.
It's in the "Add Action..." dropdown on the bottom of the
review page.

  • Update from review comments
carlosgalvezp requested changes to this revision.Jan 21 2022, 11:37 PM
carlosgalvezp added inline comments.
clang-tools-extra/test/clang-tidy/checkers/Inputs/readability-duplicate-include/readability-duplicate-include.h
2

You make a very valid point and I would definitely agree with that for "regular" code.

However we are talking about a very special case here - header guards. Header guards have a de-facto standard based on #ifndef/#define. It's actually less error prone to write it like that - a well-written header guard will have perfect visual alignment:

#ifndef FOO_H
#define FOO_H

If you missed the n in #ifndef, you would notice the misalignment immediately and know what to fix.

#ifdef FOO_H
#define FOO_H

This also helps making sure that the macro name is identical on both lines. This visual alignment is broken when using #if !defined, which makes it harder to detect these problems.

Furthermore, I have run a quick search in the whole repo and I cannot find one single instance where a header guard is written in the way that you propose here.

$ git grep -E "#ifndef[ ]+[A-Z_]+_H" | wc -l
9573

$ git grep -E "#if[ ]+\!defined\([A-Z_]+_H\)" | wc -l
7

All such 7 instances are exclusively used for error handling or other logic, not for defining header guards.

Therefore I don't see this as a style choice, but rather a repo-wide convention that shall be followed, and so I feel it's my duty as reviewer to request changes.

I understand this is annoying after 7 years waiting for a patch, but I think it's a very easy fix to do. I can approve the patch right away after this if fixed. Thank you for your patience! :)

This revision now requires changes to proceed.Jan 21 2022, 11:37 PM
LegalizeAdulthood marked an inline comment as done.

Update from review comments

clang-tools-extra/test/clang-tidy/checkers/Inputs/readability-duplicate-include/readability-duplicate-include.h
2

When it comes to matters that I consider a style preference, I
always prioritize the style of the team/project/code base over
my own style preferences unless I'm the only person working
on the existing code base :)

carlosgalvezp accepted this revision.Jan 22 2022, 12:22 PM

Great patch, thanks!!

Great patch, thanks!!

Can you mark the patch as accepted in phabricator?
Thanks

carlosgalvezp added a comment.EditedJan 22 2022, 4:39 PM

Great patch, thanks!!

Can you mark the patch as accepted in phabricator?
Thanks

I did do that and it shows that I have accepted the revision, what else should I do?

Or is it because of the negative vote from @alexfh ? Since he's not replying perhaps you can remove him from the reviewer list?

Great patch, thanks!!

Can you mark the patch as accepted in phabricator?
Thanks

I did do that and it shows that I have accepted the revision, what else should I do?

Or is it because of the negative vote from @alexfh ? Since he's not replying perhaps you can remove him from the reviewer list?

Ah yes, I see now, it's in the history that you accepted the patch.
I'm not sure how to clear the "needs revision" status from Alex
years ago, so I'm going to submit this.

This revision is now accepted and ready to land.Jan 23 2022, 7:09 AM

OK, removing Alex has cleared it.

Thanks for this, it found 74 occurrences of this in the firefox code base
https://hg.mozilla.org/mozilla-central/rev/b07f125d09fd