This is an archive of the discontinued LLVM Phabricator instance.

[IncludeCleaner] Introduce decl to location mapping
ClosedPublic

Authored by kadircet on Oct 14 2022, 1:54 AM.

Details

Summary

Creates a one to many mapping, by returning all the possible locations
providing a decl. Also includes an "is definition" signal for the
location, that can be used for ranking afterwards.

This also takes care of stdlib symbols by having a variant of locations.

Depends on D135859.

Diff Detail

Event Timeline

kadircet created this revision.Oct 14 2022, 1:54 AM
Herald added a project: Restricted Project. · View Herald TranscriptOct 14 2022, 1:54 AM
Herald added a subscriber: mgrang. · View Herald Transcript
kadircet requested review of this revision.Oct 14 2022, 1:54 AM
Herald added a project: Restricted Project. · View Herald TranscriptOct 14 2022, 1:54 AM
Herald added a subscriber: cfe-commits. · View Herald Transcript
sammccall added inline comments.Oct 14 2022, 2:53 AM
clang-tools-extra/include-cleaner/lib/AnalysisInternal.h
54

This is an important public API concept ==> It should be documented and part of a public header, I think

clang-tools-extra/include-cleaner/lib/WalkAST.cpp
167 ↗(On Diff #467715)

I don't think "definition" is the right concept here, the better signal in experiments was "is this declaration usable in general", which a function forward declaration is.
(In the prototype this was IsComplete and only set for template/class cases where completeness matters, InIncomplete is probably a bit cleaner.)

Also it would be nice to avoid passing bools around to represent subtle concepts (which definition is).

tschuett added inline comments.Oct 16 2022, 11:47 AM
clang-tools-extra/include-cleaner/lib/AnalysisInternal.h
54

IIRC, the supported Apple Clang does not support std::variant yet.

We support Apple Clang 9.3, but std:variant ships with Apple Clang 10.0. :-(

We support Apple Clang 9.3, but std:variant ships with Apple Clang 10.0. :-(

Hi @tschuett, there are other pieces of LLVM that use <variant> today already, some big components include TableGen, DenseMap, CodeGen, flang, pseudo, MLIR. so i guess LLVM is already non-compilable today on apple clang 9.3 and someone needs to do a cleanup (i am happy to drop the usage here and in my other patches, but it won't prevent others from introducing more usage, so I don't see how useful my changes here would be)

I've brought this up in https://discourse.llvm.org/t/important-new-toolchain-requirements-to-build-llvm-will-most-likely-be-landing-within-a-week-prepare-your-buildbots/61447/12?u=kadircet. Please pile up on that thread if you're a maintainer for that particular toolchain, rather than just a concerned citizen.

kadircet updated this revision to Diff 476091.Nov 17 2022, 4:39 AM
  • Rebase
  • Move to new file
  • Handle macros
  • Change from boolean output to enum for signalling
sammccall added inline comments.Nov 17 2022, 8:27 AM
clang-tools-extra/include-cleaner/lib/AnalysisInternal.h
85

can you explain the design of hints a bit?

In particular:

  • are hints emitted by every mapping layer (node -> symbol -> location -> header -> include)? if so, are they the same kind (i.e. same enum?)
  • are the hints from one stage provided to subsequent stages?
  • node -> symbol mapping already produces hints in the form of RefKind, do we reconcile these concepts somehow?
  • do hints for the same entity found on different paths get merged, or will we ultimately report a list of vector<HintedHeader> where the same header may repeat with different hints?
87

Is this intended to be:

  • a bit? needs to be nonzero
  • a set of bits? should be None or so as it does not set any bits
  • something else? I don't understand
89

You're only setting this in cases where incompleteness is possible and significant. e.g. for AliasDecls it is not set even though they are (shallowly) complete.

I think the most self-documenting solution is to invert it to Incomplete.
Otherwise the docs here should explain when it is set vs not set.

96

Having been down this path, I think using a pair for this is too clumsy for such an important data type (and this may be repeated for each mapping layer).

I'd suggest a template that can attach hints to another type, and acts as a smart pointer to that type: Loc->kind(), Loc.hint() rather than Loc.first.kind(), Loc.second, can define a decent print function, etc.

clang-tools-extra/include-cleaner/lib/LocateSymbol.cpp
21

Hmm, this isn't list (e.g. ObjCCategoryDecl), is hard to complete, and we don't actually have to complete it to get the correct behavior. This suggests it's not the right primitive, e.g. it's hard to look at the code and see whether there's a bug.

One option would be to make this function isIncomplete, call it on each redecl, and use isThisDeclarationADefinition.
But we can also avoid the chain of dyn_casts for each redecl by making this function something like Optional<Decl*> getCompletingDefinition() where None means any redecl is complete (usable without definition) and nullptr means no redecl is complete (definition needed but not found), and a pointer means that redecl is complete and no other is.

25

the net result of include VarDecl and FunctionDecl here is that we prefer headers defining these things over headers declaring them. How sure are we that this is a good heuristic?

I thought the idea here was "prefer declarations that are complete enough to use", which is basically for classes & templates, but not functions and variables.

If it's rather "definitions are more likely to be the thing we want" then the hint should probably be named Definition rather than Complete.

40

I think the same logic that says we include a header that is // IWYU pragma private says we should, probably with some Private hint.

42

nit: I think {{...}} is less confusing for both compilers and humans if you spell the inner type

clang-tools-extra/include-cleaner/lib/WalkAST.cpp
20 ↗(On Diff #476091)

unused? and the other two

clang-tools-extra/include-cleaner/unittests/LocateSymbolTest.cpp
37

this is a heavy helper that embeds the assertions inside itself, which is a pattern that has gotten us into trouble in the past.

locateSymbol has a pretty simple signature and seems possible to test more directly with something like:

LocateExample X("struct $decl^foo; struct $def^foo {};");
EXPECT_THAT(
    locateSymbol(X.findDecl("decl", "foo")),
    ElementsAre(
        Pair(X.loc("decl"), 0),
        Pair(X.loc("def"), Hint::Complete));

it's a little longer, but also more flexible (tests hints without hacks, can test standard library, can test macros), is attached to the right line, etc

72

def seems like a strange annotation for Complete - again definition/complete should be different concepts or have the same name

92

this doesn't actually test that it does the right thing: it would pass if it returns nothing

93

should we have at least a simple testcase for macros?

clang-tools-extra/include-cleaner/unittests/WalkASTTest.cpp
17 ↗(On Diff #476091)

I can't see where twine/variant are used, is a tool suggesting these?

kadircet updated this revision to Diff 478611.Nov 29 2022, 8:21 AM
kadircet marked 6 inline comments as done.
  • Leaving out all the pieces around signals as discussed.
  • Update tests to use a LocateExample helper, have a test for macros.
  • Get rid of residues in other test files.
sammccall accepted this revision.Nov 30 2022, 9:09 AM
sammccall added inline comments.
clang-tools-extra/include-cleaner/lib/AnalysisInternal.h
29

unused?

(flagging these in case these are tool-assisted but the tool is buggy)

30

unused?

clang-tools-extra/include-cleaner/unittests/LocateSymbolTest.cpp
53

or maybe clearer return Inputs, and you're calling AST(Inputs) as the member initializer

76

nit: you have an assertion here but not in findMacro

113

<< Case, or SCOPED_TRACE, or something so we can see what test it is

This revision is now accepted and ready to land.Nov 30 2022, 9:09 AM
This revision was automatically updated to reflect the committed changes.
kadircet marked 6 inline comments as done.