This is an archive of the discontinued LLVM Phabricator instance.

[include-cleaner] Loose matching for verbatim headers
ClosedPublic

Authored by sammccall on Jul 20 2023, 5:23 AM.

Details

Summary

A verbatim header usually corresponds to a symbol from a header with
a pragma "IWYU pragma: private, include <foo.h>".

Currently this is only satisfied if the main file contains exactly

#include <foo.h>

In practice this is too strict, we also want to allow

#include "path/to/foo.h"

so long as they resolve to the same file.

We cannot be 100% sure without doing IO, and we're not willing to do
that, but we can detect the common cases based on paths.

Diff Detail

Event Timeline

sammccall created this revision.Jul 20 2023, 5:23 AM
Herald added a project: Restricted Project. · View Herald TranscriptJul 20 2023, 5:23 AM
sammccall requested review of this revision.Jul 20 2023, 5:23 AM
Herald added a project: Restricted Project. · View Herald TranscriptJul 20 2023, 5:23 AM
Herald added a subscriber: cfe-commits. · View Herald Transcript
sammccall updated this revision to Diff 542444.Jul 20 2023, 5:24 AM

clean up some leftovers

(ping - also happy to send this to someone else if you prefer!)

kadircet added inline comments.Jul 26 2023, 12:52 AM
clang-tools-extra/include-cleaner/include/clang-include-cleaner/Types.h
207

maybe add a comment saying that these spellings are generated heuristically?

clang-tools-extra/include-cleaner/lib/Types.cpp
110–112

nit: return P.rtrim('/'); // only separator we can encounter is forward slash at this point.

129

nit: prefer early exit

137

i think this assumption breaks when we have an absolute include search path, which is a subdirectory of the CWD of file manager. as we might now have a resolved include which is relative, but a search path that'll never match a prefix of those includes.

i guess this is fine, as we're trying to heuristically match. but it would be useful to mention it here and add a unittest for that. (unless I am missing something and headersearch already normalizes such paths)

141

nit: prefer early exit

142–144

nit: auto Rel = llvm::StringRef(Path).drop_front(Parent.size()).ltrim('/'); // we already normalized the path and only have forward slashes.

171–173

i think we shouldn't look into alternate spellings if we've already found an exact spelling. we would be matching wrong headers for sure in those cases.

sammccall updated this revision to Diff 544323.Jul 26 2023, 5:36 AM
sammccall marked 6 inline comments as done.

address review comments

clang-tools-extra/include-cleaner/lib/Types.cpp
110–112

Oops, yes!
Kept the loop though to avoid making an extra copy, APIs are awkward

137

This is part of the contract of the function, pulled in the doc comment from D155878 and added a test for the cases that don't work.

171–173

I had it this way initially and changed my mind. Having the presence of some includes affect whether others match at all breaks an "obvious" part of the contract. (So obvious that I assumed it in the test ten minutes after deliberately breaking it). I wouldn't add a special case here unless there's some observed bad-results motivation for it.

we would be matching wrong headers for sure in those cases.

You could be including the same header under two paths (and in fact that's what this testcase is doing). I don't know why you would, but we also don't have real examples of this happening with *different* headers.

kadircet accepted this revision.Jul 27 2023, 3:01 AM
kadircet added inline comments.
clang-tools-extra/include-cleaner/lib/Types.cpp
171–173

You could be including the same header under two paths (and in fact that's what this testcase is doing). I don't know why you would

I do think it's better to diagnose one of those as unused in such a scenario. As tools like clang-format won't be able to de-duplicate those automatically.

but you're right about not having examples for alternate spellings resolving to different files, so I guess this is at least "safe" and we can change the behaviour if we encounter examples.

This revision is now accepted and ready to land.Jul 27 2023, 3:01 AM
This revision was automatically updated to reflect the committed changes.