This is an archive of the discontinued LLVM Phabricator instance.

[clang-tidy] Introduce HeaderFileExtensions and ImplementationFileExtensions options
ClosedPublic

Authored by carlosgalvezp on Jan 4 2023, 9:10 AM.

Details

Summary

We have a number of checks designed to analyze problems
in header files only, for example:

bugprone-suspicious-include
google-build-namespaces
llvm-header-guard
misc-definitions-in-header
...

All these checks duplicate the same logic and options
to determine whether a location is placed in the main
source file or in the header. More checks are coming
up with similar requirements.

Thus, to remove duplication, let's move this option
to the top-level configuration of clang-tidy (since
it's something all checks should share).

Since the checks fetch the option via getLocalOrGlobal,
the behavior is unchanged.

Add a deprecation notice for all checks that use the
local option, prompting to update to the global option.

The functionality for parsing the option will need to
remain in the checks during the transition period.
Once the local options are fully removed, the goal
is to store the parsed options in the ClangTidyContext,
that checks can easily have access to.

Diff Detail

Event Timeline

carlosgalvezp created this revision.Jan 4 2023, 9:10 AM
Herald added a project: Restricted Project. · View Herald Transcript
carlosgalvezp requested review of this revision.Jan 4 2023, 9:10 AM
Herald added a project: Restricted Project. · View Herald TranscriptJan 4 2023, 9:10 AM
Herald added a subscriber: cfe-commits. · View Herald Transcript

Looks OK for me, but probably will be good idea to remove check-specific options in same commit.

Looks OK for me, but probably will be good idea to remove check-specific options in same commit.

Thanks for the review! Sure, I can do that, I just thought we should try to keep commits as minimal as possible. I also wonder if we need to follow a formal deprecation process or we can just remove the option from existing checks directly?

@njames93 , as Code Owner, could you give your view on the general direction of the patch before I move forward? Thanks!

Can you explain the reasoning of why this approach is better than current approach where checks can use global options(Options.getLocalOrGlobal("HeaderFileExtensions", utils::defaultHeaderFileExtensions())) to access the same information?

Some checks also have the global option ImplementationFileExtensions, what's the proposal with that. Should checks that use this functionality carry on like that, use a negative match of this proposed option or should we also add that option to the ClangTidyOptions.

What is the migration plan for code-bases that currently use the old check-based option? We can't break their existing configs, likewise we have to be careful of projects with checked-in .clang-tidy files that make use of this new option when maintainers may still have an older version of clang-tidy.

Finally, Why would you make the option a semicolon delimited string. The only reason that's used for the current CheckOptions is there isn't currently a better alternative, For ClangTidyOptions we could set the type to be an Optional<std::vector<std::string>>. This would be more natural when writing the yaml file.

HeaderFileExtensions: [ h, hh, hxx, hpp, "" ]
# Or
HeaderFileExtensions:
  - h
  - hh
  - hxx
  - hpp
  - ""
carlosgalvezp added a comment.EditedJan 5 2023, 10:53 AM

Thanks for the feedback!

Can you explain the reasoning of why this approach is better than current approach where checks can use global options(Options.getLocalOrGlobal("HeaderFileExtensions", utils::defaultHeaderFileExtensions())) to access the same information?

Nice, I wasn't aware of the getLocalOrGlobal function. It's unclear to me why some arguments are available via both command line and config, and others only via config. If there is no use case for CLI I agree the config-only option is sufficient, but I would like to properly document it. There's for example the option User: user that is undocumented. Perhaps I can restructure a bit the section Configuration files?

Some checks also have the global option ImplementationFileExtensions, what's the proposal with that. Should checks that use this functionality carry on like that, use a negative match of this proposed option or should we also add that option to the ClangTidyOptions.

For consistency I believe it makes sense to add it as well as a global option.

What is the migration plan for code-bases that currently use the old check-based option? We can't break their existing configs, likewise we have to be careful of projects with checked-in .clang-tidy files that make use of this new option when maintainers may still have an older version of clang-tidy.

We would follow a deprecation strategy. For example, we document in every check that the current options are deprecated, and users should instead use the global config. The local config can be removed after 2 clang releases (that would mean clang 18 here). Using getLocalOrGlobal should simplify the logic of supporting the transition I believe.

Finally, Why would you make the option a semicolon delimited string. The only reason that's used for the current CheckOptions is there isn't currently a better alternative, For ClangTidyOptions we could set the type to be an Optional<std::vector<std::string>>. This would be more natural when writing the yaml file.

Absolutely! The current format is very strange to me, but I kept it to keep the patch as minimal as possible. A vector of strings looks great.

carlosgalvezp added a comment.EditedJan 6 2023, 2:29 AM

On a second thought, using getLocalOrGlobal would still require all checks to have 2 private member variables (the raw string option and the parsed header option), together with the logic in the constructor to get the option an parse that, as well as functions to storeOptions.

That's a lot of duplication that I would like to get rid of. Keeping those private variables to get a global option is also a bit inconsistent with the rest of checks and might cause confusion. What do you think?

I found now what this global User option is about:

/// Specifies the name or e-mail of the user running clang-tidy.
///
/// This option is used, for example, to place the correct user name in TODO()
/// comments in the relevant check.
std::optional<std::string> User;

So it seems to me it's an option used only in one check, and yet it's a global option. Based on that I think it's reasonable to have HeaderFileExtensions and ImplementationFileExtensions (which are shared among multiple checks) as global as well. It doesn't need to be a CLI option, config-only is fine.

You do raise a good point about duplication, Therefore does it make sense to also do the same with the IncludeStyle as lots of checks add new includes.
Though there is a precedent to do away with LLVM/Google style, as clang-format should be responsible for reordering inserted includes.

I found now what this global User option is about:

/// Specifies the name or e-mail of the user running clang-tidy.
///
/// This option is used, for example, to place the correct user name in TODO()
/// comments in the relevant check.
std::optional<std::string> User;

So it seems to me it's an option used only in one check, and yet it's a global option. Based on that I think it's reasonable to have HeaderFileExtensions and ImplementationFileExtensions (which are shared among multiple checks) as global as well. It doesn't need to be a CLI option, config-only is fine.

Yeah, I think there is definitely precedent to move that field out of the Options. IMO the google-readability-todo check should instead read that field from the Check GlobalOptions, and its default value should be set in the GoogleModule::getModuleOptions virtual function.

Therefore does it make sense to also do the same with the IncludeStyle as lots of checks add new includes.

Sure, that could also be done, though I think it'd be preferable to do it in a separate patch. Personally I'm a bit reluctant to having formatting options in clang-tidy at all, since that is responsibility of clang-format - then a project has duplicated settings in both tools, and it gets tricky when they get in conflict. If a project cares about formatting they would most likely have clang-format as a pre-commit check anyway to take care of formatting after clang-tidy has fixed the rest. But if people want to keep the option I won't oppose :)

  • Add also ImplementationFileExtensions.
  • Add deprecation notice to affected checks.
carlosgalvezp retitled this revision from [clang-tidy] Introduce HeaderFileExtensions option to [clang-tidy] Introduce HeaderFileExtensions and ImplementationFileExtensions options.
carlosgalvezp edited the summary of this revision. (Show Details)

Update commit message

It'll be good idea to check that HeaderFileExtensions and ImplementationFileExtensions do not overlap.

This comment was removed by carlosgalvezp.

Since we are at it - are we happy with the name "ImplementationFileExtensions"? I think "SourceFileExtensions" is more common in the literature, and it would look more visually pleasing in the config file since it has the same length as "HeaderFileExtensions".

HeaderFileExtensions: ["h", "hh"]
ImplementationFileExtensions: ["c", "cc"]

vs

HeaderFileExtensions: ["h", "hh"]
SourceFileExtensions: ["c", "cc"]

Add check to ensure HeaderFileExtensions and ImplementationFileExtensions
do not overlap.

It'll be good idea to check that HeaderFileExtensions and ImplementationFileExtensions do not overlap.

Good idea, done!

Since we are at it - are we happy with the name "ImplementationFileExtensions"? I think "SourceFileExtensions" is more common in the literature, and it would look more visually pleasing in the config file since it has the same length as "HeaderFileExtensions".

HeaderFileExtensions: ["h", "hh"]
ImplementationFileExtensions: ["c", "cc"]

vs

HeaderFileExtensions: ["h", "hh"]
SourceFileExtensions: ["c", "cc"]

Yes, there's a clear preference in the repo:

$ git grep -E [sS]ource.?[fF]ile | wc -l
5932
$ git grep -E [iI]mplementation.?[fF]ile | wc -l
104

Will change accordingly.

On the other hand, the LLVM coding standards talk about "implementation file". Ok, will leave as is. Then I'd say it's ready for review! :)

Rebase and fix formatting of "clang-tidy" in the docs.

@Eugene.Zelenko @njames93 I believe I have addressed all the outstanding comments, is there anything else you think needs fix? Otherwise I would like to land the patch before January 24th when they will start a new release.

Eugene.Zelenko accepted this revision.Jan 14 2023, 6:54 AM

Looks OK for me, but please improve Release Notes. Will be good idea to wait for developers approval.

clang-tools-extra/docs/ReleaseNotes.rst
109

Will be good idea to combine two sentences into one because options are closely related.

This revision is now accepted and ready to land.Jan 14 2023, 6:54 AM

Merge release notes into one line

carlosgalvezp marked an inline comment as done.Jan 14 2023, 8:39 AM

On the topic of Objective-C, they use .m and .M extensions, it may be worth pointing out that this list is case sensitive (I'm assuming that's going to be the case.)

clang-tools-extra/clang-tidy/ClangTidyOptions.cpp
151–152

clang-tidy also works on ObjectiveC, ObjectiveC++ and cuda source files, We shouldn't be ignoring those in the defaults here.
Also some libraries(including the standard library) make use of extension-less header files, maybe its worth adding an empty entry in the header file extensions.

carlosgalvezp added inline comments.Jan 14 2023, 10:21 AM
clang-tools-extra/clang-tidy/ClangTidyOptions.cpp
151–152

Fully agree, but I'd like to keep the scope of this patch focused only on moving the options from local to global options - it should be a NFC. I took the defaults that we have today in all checks, which also happen to be the hardcoded defaults in FileExtensionUtils.h:

inline StringRef defaultHeaderFileExtensions() { return ";h;hh;hpp;hxx"; }

Adding more extensions requires changes to those source files as well, increase sizes of SmallVectors therein, etc, so I think it'd fit better as a separate patch. Would you agree?

By the way, does the initial semicolon above mean that we consider extension-less headers? I can fix that in this patch.

Add missing empty header extension to the default list.

Remove buggy empty extension for ImplementationFilesExtension

@njames93 Let me know if you are happy with the patch, I'd like to land before 24th of January :)

D142123 also cries for it :-) Somehow I was not able to make comments there :-(

D142123 also cries for it :-) Somehow I was not able to make comments there :-(

Yep, one more reason to get this in asap :)