Page MenuHomePhabricator

[clangd] IncludeCleaner: Be more conservative in marking RecordDecl usage
Changes PlannedPublic

Authored by kbobyrev on Oct 28 2021, 5:05 AM.

Details

Reviewers
kadircet
Summary

For expression with RecordDecl type, we only need its definition in case it's
available. And when it isn't, there's a high chance the forward decl is in the
same file, so take the most recent decl.

Diff Detail

Event Timeline

kbobyrev created this revision.Oct 28 2021, 5:05 AM
kbobyrev requested review of this revision.Oct 28 2021, 5:05 AM
kadircet added inline comments.Oct 28 2021, 6:37 AM
clang-tools-extra/clangd/IncludeCleaner.cpp
51

i think this might turn a direct dependency into a transitive one, e.g. you got forward declarations of struct Foo; in a.h and b.h, then c.h includes b.h. In the main file you might have includes for a.h and c.h. Now the most recent redecl happens through c.h hence a.h will be marked as unused, even though it's the one header providing the forward decl directly.

what about just rolling with the definition when it's visible and handling the forward-decl in main file case inside the add ? i suppose that's something we'd want for all decls and not just records? it implies passing in main file id and a sourcemanager into the crawler, and inside the add before going over all redecls, we just check if most recent decl falls into the main file.

What's the purpose of this patch at a high level? Was it triggered by a real example?
IIUC it's avoiding false negatives, where we're using a class and an otherwise-unused header forward-declares that class.
Avoiding false negatives isn't a high priority at this point, unless it's a *really* common case.
As Kadir says this is subtle and risks introducing false positives.

My inclination is that we shouldn't spend time making to make these heuristics more precise/complicated right now, but I'm willing to be convinced...

What's the purpose of this patch at a high level? Was it triggered by a real example?
IIUC it's avoiding false negatives, where we're using a class and an otherwise-unused header forward-declares that class.
Avoiding false negatives isn't a high priority at this point, unless it's a *really* common case.
As Kadir says this is subtle and risks introducing false positives.

My inclination is that we shouldn't spend time making to make these heuristics more precise/complicated right now, but I'm willing to be convinced...

LLVM has tons of widely used headers with lots and lots of fowrard-declared classes (mainly AST ones) which result in less-useful unused includes warnings. If I drop the change for the case when definition is not available (or fix by checking whether the last redecl is in the mainfile) then this seems like a clear improvement with no false-positives, WDYT?

clang-tools-extra/clangd/IncludeCleaner.cpp
51

I was considering that part but I decided it's probably more complications for less benefits but I can see how the false positives might turn out to be a problem.

I think this is not something we want for all decls for type checking reasons (e.g. functions?). @sammccall and I talked about similar things and I believe this is the conclusion, isn't it?

Thank you, will do!

To summarize what I remember from the offline discussion:

  • when we have multiple decls, eliminating as many as possible will reduce false negatives.
  • reducing false negatives isn't our top priority, so we should focus only on the most impactful/safe cases
  • if we see a decl in the main file, we can skip other non-definition decls.
  • if we see a record definition anywhere, we can skip all other decls. This can have false positives (opaque types whose definition is incidentally visible) but we think it's hugely impactful. We should be clear that we *only* plan to do this for plain classes, and why.

The last two points could be separate patches, up to you.

Some exceptions that seem relevant:

  • we can't skip based on a definition (main file or class) that can't stand alone, usually because of qualifiers. e.g. void foo::bar() {}.
  • friend declarations are special. I suspect we should conservatively never eliminate based on them.
clang-tools-extra/clangd/IncludeCleaner.cpp
51

Seems like the problem Kadir described still exists if b.h is a definition, you'll mark a.h as unused even though it's the only header providing the symbol directly at all (and maybe you don't need the definition!).

I think this is not something we want for all decls for type checking reasons (e.g. functions?). @sammccall and I talked about similar things and I believe this is the conclusion, isn't it?

Sorry, I can't really understand what any of the "it"s or "this"s in this sentence refer to :-)

kbobyrev planned changes to this revision.Mon, Nov 8, 6:54 AM

Will split the patch in two, as suggested.