Do deduplication when inserting missing headers when diags are not self contained.
Diff Detail
- Repository
- rG LLVM Github Monorepo
Event Timeline
Thanks! @kadircet should make the call on these features (they look good to me), I have some implementation notes.
This is making two independent changes, one is a policy change that should be applied to the include-cleaner library, and one is a bugfix to the clang-tidy check. I think they should be separate patches.
clang-tools-extra/clang-tidy/misc/IncludeCleanerCheck.cpp | ||
---|---|---|
169 | If we want this policy, it should be provided at the include-cleaner library level so it applies to all clients, not just the clang-tidy check. I think the clearest way is to encapsulate this in include_cleaner::Includes, so that match() never matches a Header requirement against a duplicative Include. This will naturally lead to the directives being marked as unused. (In practice, the result will work in clangd, and will respect configuration of headers where analysis should be ignored, keep directives, etc) | |
227 | This means that if you're looking at individual diagnostics rather than applying all fixes to a file, only the first diagnostic will have any fix available at all. I believe the preferred solution is to do this conditionally based on areDiagsSelfContained(). clang_tidy::utils::IncludeInserter encapsulates this, but isn't used here. I don't know whether we would want to use it here, or how carefully it's already been considered. (It definitely contains a lot of logic that is dubious to apply when the include to insert has already been precisely calculated, but also some things that would be helpful). Will defer to Kadir on that. |
- Use clang::tidy::utils::IncludeInserter to avoid repeated insertion;
- Leave out the deduplication policy part.
Thanks a lot for the patch @danix800 !
I initially was rather focused on behavior of this check in workflows that require seeing "self-contained-diags", but also I rather see the bulk-apply use cases as always requiring clang-format afterwards. Hence didn't really try to polish that use case a lot, but I believe changes in this patch improve those use cases reasonably. But users do still need to run clang-format afterwards, e.g. if we first generate a finding that inserts "b.h" and then "a.h", they'll be in wrong order. So this is only fixing some cases. If you'd like to work on a more complete approach, we can prepare all the edits in a single tooling::Replacements and run clang::format::cleanupAroundReplacements on top of it to generate proper edits, and emit FixIts with those merged replacements. Note that this still won't be enough in all cases, as there can be other checks that emit fixes that are touching includes :/
Regarding usage of IncludeInserter; we were deliberately not using IncludeInserter here, as it has slightly different semantics to HeaderIncludes, which is used by all the other applications of include-cleaner when producing edits. That way it's easier for us to reason about the behavior in various places (and also to fix them). But moreover, HeaderIncludes uses clang-format config to figure out include-styles and works with a bigger set of projects without requiring extra configurations (hence this patch will actually be a regression for those). Therefore can you keep using HeaderIncludes, while skipping generation of duplicate fixits when we're in non-self-contained-diags mode (assuming you don't want to generalize the approach as I explained above).
Thanks for the background!
If we have further plans from upstream on these checker/clang-format related logic, I can wait and abandon this revision.
Or if such issue does need fixing then coud you take a look at the previous diff 1
which uses an internal set to prevent this issue?
Not at the moment, so please don't!
Or if such issue does need fixing then coud you take a look at the previous diff 1
which uses an internal set to prevent this issue?
Yes, I think that approach makes sense. But we should make it conditional on areDiagsSelfContained (basically de-duplicate only when diags are not self contained).
- Revert to internal set (not using IncludeCleaner);
- Only do deduplication when not areDiagsSelfContained().
If we want this policy, it should be provided at the include-cleaner library level so it applies to all clients, not just the clang-tidy check.
I think the clearest way is to encapsulate this in include_cleaner::Includes, so that match() never matches a Header requirement against a duplicative Include. This will naturally lead to the directives being marked as unused.
(In practice, the result will work in clangd, and will respect configuration of headers where analysis should be ignored, keep directives, etc)