This is an archive of the discontinued LLVM Phabricator instance.

[clang-tdiy] Add header file extension configuration support.
ClosedPublic

Authored by hokein on Jan 12 2016, 6:20 AM.

Details

Summary
  • Add a HeaderFileExtensions check option in misc-definitions-in-headers, google-build-namespaces and google-global-names-in-headers.

Diff Detail

Event Timeline

hokein updated this revision to Diff 44633.Jan 12 2016, 6:20 AM
hokein retitled this revision from to [clang-tdiy] Add header file extension configuration support..
hokein updated this object.
hokein updated this object.Jan 12 2016, 6:24 AM
hokein added reviewers: alexfh, aaron.ballman.
hokein set the repository for this revision to rL LLVM.
hokein added a subscriber: cfe-commits.
aaron.ballman edited edge metadata.Jan 12 2016, 2:18 PM

Public-facing documentation should also be updated to specify the new option for the various checkers.

aaron.ballman added inline comments.Jan 12 2016, 2:18 PM
clang-tidy/ClangTidy.h
63

"in both options" -> "in either list"

65

Any reason for Default to not be a StringRef (I assume this is always going to be a hard-coded string literal in callers)? At the very least, it should be a const std::string &.

clang-tidy/ClangTidyOptions.cpp
269 ↗(On Diff #44633)

This code seems like it will fail if there are spaces in the extension list, like .h, .hxx instead of .h,.hxx. I think the list, as given by the end user, should be processed into a vector of sorts so that things can be normalized (entries without a period can be diagnosed, spaces removed, etc).

270 ↗(On Diff #44633)

clang-format the patch.

clang-tidy/ClangTidyOptions.h
216 ↗(On Diff #44633)

endsWithHeaderFileExtension instead? However, given that uses of this all start with a SourceLocation, I wonder if that makes for a cleaner API: isLocInHeaderFile(SourceLocation, <Extensions>);

Also, how does this work if I want to include an extension-less file as the header file "extension?" It would be plausible if the extensions were passed in as a list, but as it stands it doesn't seem possible without weird conventions like leaving a blank in the list (e.g., .h,,.hpp), which seems error-prone.

Also, I'm not certain what I can pass in. The documentation should be updated to state whether these extensions are intended to include the ".".

hokein added inline comments.Jan 12 2016, 6:13 PM
clang-tidy/ClangTidyOptions.h
216 ↗(On Diff #44633)

endsWithHeaderFileExtension instead? However, given that uses of this all start with a SourceLocation, I wonder if that makes for a cleaner API: isLocInHeaderFile(SourceLocation, <Extensions>);

Using SourceLocation only is not enough to retrieve the belonging file name (we need SourceManager too).

Also, how does this work if I want to include an extension-less file as the header file "extension?" It would be plausible if the extensions were passed in as a list, but as it stands it doesn't seem possible without weird conventions like leaving a blank in the list (e.g., .h,,.hpp), which seems error-prone.

Yeah, for extensions-less header file, you can pass the string like .h,,.hpp, which is a bit of weird. Do you have a better idea here? Passing a string into header-file-extensions seems the most reasonable choice.

alexfh edited edge metadata.Jan 13 2016, 3:10 AM

Thank you for the patch! I have a few concerns though. See inline comments.

clang-tidy/ClangTidy.cpp
348

There's no need to declare a vector: for (const string &s : {"a", "b", "c"}) ....

Also, I don't think it makes sense to create a vector of two elements and run a loop over it instead of duplicating three lines of code.

clang-tidy/ClangTidy.h
62

We need to explicitly specify the order in which the options are considered: local, global, default.

clang-tidy/ClangTidyOptions.cpp
269 ↗(On Diff #44633)

It's rather inefficient to split the string each time the function is called. If even we needed this function, we would better pass an ArrayRef<StringRef>. But even better, we can use llvm::StringSet and get rid of this function at all (we might need a helper function to parse a comma-separated list of extensions to a StringSet though).

clang-tidy/ClangTidyOptions.h
216 ↗(On Diff #44633)

This function doesn't belong here. I'm also not sure we need this function at all. First, it's ineffective to split the string containing the list of extensions each time. Second, if we store a set of extensions, then we can just search for the actual file extension in this set.

clang-tidy/google/GlobalNamesInHeadersCheck.h
24

nit: s/There is one option:/The check supports these options:/

25

s/the file extensions that regard as header file/a comma-separated list of filename extensions of header files/

I don't think this list should include the period. It's better to just store extensions as returned by llvm::sys::path::extension(), so that it can be simply looked up in a set.

Also, it's common to indent lists, so please insert two spaces at the beginning of this line and the next one.

37

As mentioned earlier, this can be a set of strings (e.g. llvm::StringSet).

clang-tidy/misc/DefinitionsInHeadersCheck.cpp
21–22

s/useHeaderFileExtension/usesHeaderFileExtension/

clang-tidy/tool/ClangTidyMain.cpp
78 ↗(On Diff #44633)

We don't need a command-line flag for this. The regular way to configure clang-tidy is .clang-tidy files. However, if needed, check options can be configured from the command line via the -config= option.

alexfh added inline comments.Jan 13 2016, 3:16 AM
clang-tidy/ClangTidyOptions.h
216 ↗(On Diff #44633)

isLocInHeaderFile(SourceLocation, ...) is a nice idea, but we'd need to be more specific: either isExpansionLocInHeaderFile(SourceLoc, ...) or isSpellingLocInHeaderFile(SourceLoc, ...) (or both).

hokein added inline comments.Jan 13 2016, 4:36 PM
clang-tidy/tool/ClangTidyMain.cpp
78 ↗(On Diff #44633)

From my understanding, it is a global option for header-file-extensions, right? If we remove this command-line flag, there is only local header-file-extensions option remained for particular checks.

alexfh added inline comments.Jan 14 2016, 5:22 AM
clang-tidy/tool/ClangTidyMain.cpp
78 ↗(On Diff #44633)

Both local and global options reside in CheckOptions and can be configured in the same way using either a .clang-tidy file or the -config= command-line option. The only difference between local and global options is that the name of the latter is not prefixed with the "CheckName.". Makes sense?

aaron.ballman added inline comments.Jan 14 2016, 7:02 AM
clang-tidy/ClangTidyOptions.h
216 ↗(On Diff #44633)

Yeah, for extensions-less header file, you can pass the string like .h,,.hpp, which is a bit of weird. Do you have a better idea here? Passing a string into header-file-extensions seems the most reasonable choice.

I thought those user configurations from the command line were in YAML or JSON format, those both have the notion of lists, don't they? I would imagine this would take a SmallVectorImpl<StringRef/std::string> for the list of extensions.

216 ↗(On Diff #44633)

isLocInHeaderFile(SourceLocation, ...) is a nice idea, but we'd need to be more specific: either isExpansionLocInHeaderFile(SourceLoc, ...) or isSpellingLocInHeaderFile(SourceLoc, ...) (or both).

That's true, and I would think both are reasonable to add. I rather prefer that as an API instead of passing around file names as strings, personally.

clang-tidy/tool/ClangTidyMain.cpp
78 ↗(On Diff #44633)

Both local and global options reside in CheckOptions and can be configured in the same way using either a .clang-tidy file or the -config= command-line option. The only difference between local and global options is that the name of the latter is not prefixed with the "CheckName.". Makes sense?

Ah, thank you for that explanation. I learned something new today. ;-)

alexfh added inline comments.Jan 14 2016, 7:16 AM
clang-tidy/ClangTidyOptions.h
216 ↗(On Diff #44633)

User configurations are stored in YAML, however, CheckOptions is a map of strings to strings, so we can't use YAML lists to represent lists of extensions.
I personally see nothing wrong with ",.h,.hh,.hpp" for example, to represent the list of extensions (<empty> being the first one makes it somewhat stand out and provided there's a good documentation, this shouldn't be too confusing).

aaron.ballman added inline comments.Jan 14 2016, 7:25 AM
clang-tidy/ClangTidyOptions.h
216 ↗(On Diff #44633)

I personally see nothing wrong with ",.h,.hh,.hpp" for example, to represent the list of extensions (<empty> being the first one makes it somewhat stand out and provided there's a good documentation, this shouldn't be too confusing).

I find it to be more clever than intuitive. If it were not user-facing, I would be less concerned. I don't think users should have to read documentation to figure out the *syntax* of how to pass options if we can at all avoid it. ;-)

Regardless, I would like to separate the two concepts -- there's the way we expose the option to the users, and there's our internal APIs that we call. I don't think the internal API has to take such an awkward thing directly just because the user-facing option has to be that way currently.

alexfh added inline comments.Jan 14 2016, 8:18 AM
clang-tidy/ClangTidyOptions.h
216 ↗(On Diff #44633)

This aligns with what I wrote in my first comment on this:

This function doesn't belong here. I'm also not sure we need this function at all. First, it's ineffective to split the string containing the list of extensions each time. Second, if we store a set of extensions, then we can just search for the actual file extension in this set.

I still think the function doesn't belong here. And we also shouldn't split the string each time we call it. So I suggest adding these functions to clang-tidy/utils/:

bool isExpansionLocInHeader(const? SourceManager &SM, SourceLocation Loc, const StringSet &HeaderFileExtensions);
bool isSpellingLocInHeader(const? SourceManager &SM, SourceLocation Loc, const StringSet &HeaderFileExtensions);
StringSet splitCommaSeparatedSet(StringRef S); // Or something similar.
hokein updated this revision to Diff 44951.Jan 14 2016, 5:13 PM
hokein edited edge metadata.

Address review comments.

hokein marked 11 inline comments as done.Jan 14 2016, 5:19 PM
hokein added inline comments.
clang-tidy/ClangTidy.h
65

Makes sense. I just refer from std::string get(StringRef LocalName, std::string Default) const. Also change get to use StringRef.

clang-tidy/tool/ClangTidyMain.cpp
78 ↗(On Diff #44633)

I see it.

hokein updated this revision to Diff 45006.Jan 15 2016, 9:58 AM
hokein marked an inline comment as done.

Add extension-less header file doc.

aaron.ballman added inline comments.Jan 15 2016, 10:03 AM
clang-tidy/utils/HeaderFileExtensionsUtils.cpp
49

One thing we may want to reconsider for this function is using comma as a delimiter. We have another in-flight review which has to use semi-colons for a checker option list because commas are part of the syntax for a type description (think Foo<A, B>). It would be really nice to have one utility for parsing lists of information from checker options, and for that to use the same delimiter everywhere so that users aren't confused by using commas in some places and semi-colons in others.

hokein marked 2 inline comments as done.Jan 15 2016, 10:03 AM
hokein added inline comments.
clang-tidy/ClangTidyOptions.cpp
269 ↗(On Diff #44633)

Now it's tolerant of whitespace around the commas.

hokein updated this revision to Diff 45077.Jan 16 2016, 8:05 AM
hokein marked an inline comment as done.

Format code style.

aaron.ballman added inline comments.Jan 19 2016, 8:04 AM
clang-tidy/google/GlobalNamesInHeadersCheck.h
26

s/includ/include

Is "." tolerated in the extension? It sounds like you can write: h,.hpp,hxx and it will work? Best to clarify the requirements (here and elsewhere). I don't have a strong opinion on whether the extension should include the full stop or not.

28

What if you only want to match extensionless header files?

clang-tidy/google/UnnamedNamespaceInHeaderCheck.h
25

s/includ/include

Same comment about use of ".".

clang-tidy/misc/DefinitionsInHeadersCheck.h
27

s/includ/include

Also, same comments about the full stop.

hokein updated this revision to Diff 45349.Jan 19 2016, 9:58 PM

Update comments.

hokein marked 3 inline comments as done.Jan 19 2016, 10:01 PM
hokein added inline comments.
clang-tidy/google/GlobalNamesInHeadersCheck.h
26

The "." is not tolerated in the extension. Having updated comments.

28

As the comment states, using an empty string "" only match extension-less header files.

Mostly looks good, except I do still have the question about list delimiters that I'd like to hear some thoughts on. My primary concern is that I hope to avoid having different separators for different lists, and I wonder whether we want a common "parse a list of stuff" interface for the options that then gets used by functions like parseHeaderFileExtensions() to do sanitization of the list as-needed.

hokein marked an inline comment as done.Jan 22 2016, 9:51 AM
hokein added inline comments.
clang-tidy/utils/HeaderFileExtensionsUtils.cpp
49

I'm +1 on keeping the consistence of list option.

I also check the option in clang-tidy checks, seems that this is the first time to introduce list option into clang-tidy. So ; is fine to me.

hokein updated this revision to Diff 45852.Jan 25 2016, 3:52 AM

Change delimeter from "," to ";".

aaron.ballman accepted this revision.Jan 25 2016, 7:36 AM
aaron.ballman edited edge metadata.

LGTM, thank you for working on this!

This revision is now accepted and ready to land.Jan 25 2016, 7:36 AM
hokein updated this object.Jan 26 2016, 1:27 AM
hokein edited edge metadata.
alexfh added a comment.Feb 3 2016, 8:25 AM

Sorry for the delay. A few more comments.

clang-tidy/google/GlobalNamesInHeadersCheck.cpp
30

That's not the best way to report errors, especially, when clang-tidy is used as a library. There are multiple possibilities, e.g. add a callback to ClangTidyContext (or a method to ClangTidyCheck) to report a problem with configuration, or move to two-stage initialization of checks, or something similar. No need to change this now, but please add a FIXME to find a more suitable way to handle invalid configuration options.

59

No need for the parentheses around Result.SourceManager.

clang-tidy/utils/HeaderFileExtensionsUtils.cpp
51

seems that this is the first time to introduce list option into clang-tidy.

Comma is used as a delimiter in two more checks' options: clang-tidy/modernize/UseNullptrCheck.cpp and clang-tidy/misc/AssertSideEffectCheck.cpp. It's also used on the command line for the -checks and -warnings-as-errors options. I'm not sure what to do here, if we want consistency. For now, I'd rather switch back to comma and maybe provide a more generic facility to parse lists from clang-tidy options, which could also handle both separators in a smart way (e.g. when the list contains semicolons, use only them - this will allow listing entries like "A<B,T>", otherwise use comma; maybe also support some other separator, say | for cases when we need to list entries containing semicolons).

clang-tidy/utils/HeaderFileExtensionsUtils.h
24

That's an unnecessary long and too specific namespace name. I'd go for just utils, at least for now.

34

Don't add unused functions. When someone needs them, they can add them.

alexfh requested changes to this revision.Feb 3 2016, 8:25 AM
alexfh edited edge metadata.
This revision now requires changes to proceed.Feb 3 2016, 8:25 AM
hokein updated this revision to Diff 46897.Feb 4 2016, 4:49 AM
hokein edited edge metadata.

Address Alex's comments.

hokein marked 3 inline comments as done.Feb 4 2016, 4:55 AM
hokein added inline comments.
clang-tidy/utils/HeaderFileExtensionsUtils.cpp
52

Thanks for the explanation.
Now I add delimiter parameter to this function to allow developers to set specified character.

clang-tidy/utils/HeaderFileExtensionsUtils.h
35

It is used in UnnamedNamespaceInHeaderCheck.

alexfh accepted this revision.Feb 4 2016, 6:39 AM
alexfh edited edge metadata.

Looks good with a couple of comments. Thanks for working on this!

clang-tidy/utils/HeaderFileExtensionsUtils.cpp
46

No need for llvm:: here.

47

Actually, this could be changed to a more generic parseStringSetOption or something similar. Let's do this in a follow up.

But for now we need to change the output parameter type to llvm::SmallSetImpl<StringRef> to avoid hardcoding the small size template argument. The typedef is then not useful anymore.

clang-tidy/utils/HeaderFileExtensionsUtils.h
35

Ah, missed this somehow. Then it's fine. I'm not sure though, whether PresumedLocation is the right thing in that check, but we can figure this out later.

This revision is now accepted and ready to land.Feb 4 2016, 6:39 AM
alexfh added inline comments.Feb 4 2016, 6:45 AM
clang-tidy/misc/DefinitionsInHeadersCheck.cpp
41

It's common to use just FIXME:, without the reference to the author.

hokein added inline comments.Feb 4 2016, 7:19 AM
clang-tidy/utils/HeaderFileExtensionsUtils.cpp
47

But for now we need to change the output parameter type to llvm::SmallSetImpl<StringRef> to avoid hardcoding the small size template argument. The typedef is then not useful anymore.

I can't find llvm::SmallSetImpl or any similar Set without size template argument.

alexfh added inline comments.Feb 4 2016, 7:41 AM
clang-tidy/utils/HeaderFileExtensionsUtils.cpp
47

Hm, indeed SmallSet has no base class unlike SmallVector, for example. Ok, then seems good for now.

hokein updated this revision to Diff 46923.Feb 4 2016, 8:17 AM
hokein edited edge metadata.

Update.

hokein marked 2 inline comments as done.Feb 4 2016, 8:21 AM
alexfh added a comment.Feb 5 2016, 2:14 AM

Still looks good.

This revision was automatically updated to reflect the committed changes.
alexfh added a comment.Feb 5 2016, 4:31 AM

A couple of post-commit comments.

clang-tools-extra/trunk/clang-tidy/google/GlobalNamesInHeadersCheck.h
24 ↗(On Diff #47007)

BTW, this belongs to the user-facing doc, so please move this to the .rst file. Same for other checks.

clang-tools-extra/trunk/clang-tidy/utils/HeaderFileExtensionsUtils.cpp
53 ↗(On Diff #47007)

BTW, you should be able to use a range-based for loop.