This is an archive of the discontinued LLVM Phabricator instance.

[Includecleaner] Introduce RefType to ast walking
ClosedPublic

Authored by kadircet on Oct 13 2022, 1:44 AM.

Details

Summary

RefTypes are distinct categories for each reference to a symbol. They
are signals indicating strength of a reference, that'll enable different
decision making based on the finding being provided.

There are 3 kinds of ref types:

  • Explicit, the reference is spelled in the code.
  • Implicit, the reference is not directly visible in the code.
  • Related, the reference exists but can't be proven as used (e.g. overloads brought in by a using decl but not used by the code).

Diff Detail

Event Timeline

kadircet created this revision.Oct 13 2022, 1:44 AM
Herald added a project: Restricted Project. · View Herald TranscriptOct 13 2022, 1:44 AM
Herald added a subscriber: mgrang. · View Herald Transcript
kadircet requested review of this revision.Oct 13 2022, 1:44 AM
Herald added a project: Restricted Project. · View Herald TranscriptOct 13 2022, 1:44 AM
Herald added a subscriber: cfe-commits. · View Herald Transcript
sammccall added inline comments.Oct 13 2022, 10:40 PM
clang-tools-extra/include-cleaner/lib/WalkAST.cpp
46–47

using default arguments here would reduce the noise level a lot

77

This doesn't correspond to the example given for RefKind: the docs suggest they are *unused* while here we don't know if they're used.

At least for this case it seems calling the RefKind Ambiguous would also be clearer.

clang-tools-extra/include-cleaner/unittests/WalkASTTest.cpp
120–123

This is very noisy. please find a way to avoid having to mark all test cases as $explicit, e.g. by splitting implicit cases into different tests

hokein added inline comments.Oct 14 2022, 12:20 AM
clang-tools-extra/include-cleaner/lib/AnalysisInternal.h
35

I'm wondering what's our plan of supporting policy (different coding-style may have different decisions about which includes are used)?

IIUC, the RefType is part of the picture, providing fine-grained information about each reference, and the caller can make their decisions based on it?

Thinking about the Implicit type, I think cases like non-spelled constructor call, implicit conversion calls (maybe more?) fall into this type, if we support Policy.Constructor, and Policy.Conversion, how do we distinguish with these cases? We can probably do some ad-hoc checks on the TargetDecl, but I'm not sure that the tuple <SourceLocation, TargetDecl, Ref> will provide enough information to implement different policy .

38

nit: default constructor call is vague -- S s = S(); it is a default constructor call, but it is not implicit, maybe more specific by giving a simple example, like default constructor call in S s;?

clang-tools-extra/include-cleaner/unittests/WalkASTTest.cpp
21

nit: it seems unused.

197–200

maybe add this following case, I think it will mark the the constructor S() implicit.

struct S { S(); };
S ^t;
198

nit: also add an implicit case

S t = 42;
kadircet updated this revision to Diff 470558.Oct 25 2022, 10:37 AM
kadircet marked 5 inline comments as done.
  • Rename Related to Ambigious in RefKind
  • Add more tests for implicit constructor calls
  • Default to Explicit as RefKind when reporting references
  • Don't report unused overloads, unless reference is inside a header
clang-tools-extra/include-cleaner/unittests/WalkASTTest.cpp
120–123

what about treating non-named points as explicit? that way we would only annotate non-explicit references (which are rare).

kadircet updated this revision to Diff 470560.Oct 25 2022, 10:48 AM
kadircet marked an inline comment as done.
  • Clarify default constructor call example in RefType::Implicit
clang-tools-extra/include-cleaner/lib/AnalysisInternal.h
35

IIUC, the RefType is part of the picture, providing fine-grained information about each reference, and the caller can make their decisions based on it?

Exactly that's the idea.

if we support Policy.Constructor, and Policy.Conversion, how do we distinguish with these cases? We can probably do some ad-hoc checks on the TargetDecl

In our previous discussions we've also come to the conclusion that TargetDecl will have enough information for the caller to infer such information later on. We've decided to not make it part of the public api (at least initially) because it's unclear how widely useful those signals will be. But if it turns out to be needed by a variety of applications the design is extensible to either provide a:

  • inferDetails(Symbol) helper, which would derive a signal structure that extracts most of those signals needed, or
  • update Callback signature to also carry that information derived from the Symbol

Does that make sense?

sammccall added inline comments.Oct 25 2022, 11:20 AM
clang-tools-extra/include-cleaner/include/clang-include-cleaner/Analysis.h
25

as discussed I moved the other vocab types into types.h

27

named by the reference? or am I missing a case

32

Ambiguous

clang-tools-extra/include-cleaner/lib/AnalysisInternal.h
35

In our previous discussions we've also come to the conclusion that TargetDecl will have enough information for the caller to infer such information later on

It was a while ago and my memory is bad, but this isn't my recollection (maybe you're talking about different discussions). I think it was rather "this is the bare minimum we need to get going, and we want to punt on real policy support for now".

At least "TargetDecl will have enough information" seems implausible on its face, e.g. how could you implement "types of variables are used, but types of expressions are not used" based on inspecting TargetDecl?

clang-tools-extra/include-cleaner/lib/WalkAST.cpp
155–162

(this is the first assumption in this file that we're traversing the main file. Is it worth avoiding this by checking whether the root is written in the main file?)

clang-tools-extra/include-cleaner/unittests/WalkASTTest.cpp
51

ambiguous

51

tacking on boolean optional parameters to handle an uncommon option to a function called everywhere doesn't scale well.

What about detecting the string // header in ReferencingCode?
Or even // c++-header, thinking ahead to objc etc

120–123

Not sure implicit references are actually rare, more like our examples are trivial for now :-)

But I might be wrong, up to you.

(I would at least use $A^x instead of $ambiguous^x)

kadircet marked 6 inline comments as done.Oct 28 2022, 5:51 AM
kadircet added inline comments.
clang-tools-extra/include-cleaner/lib/AnalysisInternal.h
35

It was a while ago and my memory is bad, but this isn't my recollection (maybe you're talking about different discussions). I think it was rather "this is the bare minimum we need to get going, and we want to punt on real policy support for now".
At least "TargetDecl will have enough information" seems implausible on its face, e.g. how could you implement "types of variables are used, but types of expressions are not used" based on inspecting TargetDecl?

Right. I wasn't talking about inferring signals about how a decl was found (e.g. type of an expression, as you've also mentioned), but rather talking about information about the decl itself (is this a member, constructor etc.).

clang-tools-extra/include-cleaner/unittests/WalkASTTest.cpp
120–123

i don't like the idea of splitting tests because i feel like that'll lead to duplicating lots of test cases when we have a mixture of references happening (e.g. implicit type conversion and an explicitly spelled type from same reference).

so another option i can think of here is returning annotations for points from testWalk and building expectations on top of that, e.g:

EXPECT_REFERENCES("void ^foo();", "void bar() { ^foo(); }", {"explicit"}); // checks that points in target have respective reference types
EXPECT_EXPLICIT_REFERENCES("void ^foo();", "void bar() { ^foo(); }"); // shorthand representation for making sure all the references are with a single annotation.

WDYT?

kadircet updated this revision to Diff 471512.Oct 28 2022, 5:51 AM
  • Fix typo in Ambiguous
  • Rebase for Types.h, introduce SymbolReference
  • Use comments rather than an optional parameter to indicate header-ness of main files in tests
  • Rather than assuming Root is in main file, check that either root isn't in main file or main file is a header.
hokein accepted this revision.Nov 3 2022, 2:44 AM

The current patch looks good from my side. There is a remaining discussion around the verbosity of unittests, but I think we should not be blocked by that.

clang-tools-extra/include-cleaner/include/clang-include-cleaner/Types.h
112

nit: locaition => location.

116

what's is the edge? I understand it in some degree, but I'd avoid introducing new concept as it might lead to some confusions (and we haven't defined a graph anywhere in the code).

clang-tools-extra/include-cleaner/lib/AnalysisInternal.h
24

this is out-of-date, I think, the RefType is in Types.h.

This revision is now accepted and ready to land.Nov 3 2022, 2:44 AM
kadircet updated this revision to Diff 472925.Nov 3 2022, 6:54 AM
kadircet marked 5 inline comments as done.
  • Fix typo
  • Change comment
  • Use Types.h
tschuett added inline comments.
clang-tools-extra/include-cleaner/lib/WalkAST.cpp
25

There is a cuter way to use anonymous namespaces:
https://llvm.org/docs/CodingStandards.html#anonymous-namespaces

This revision was automatically updated to reflect the committed changes.
kadircet added inline comments.Nov 8 2022, 7:34 AM
clang-tools-extra/include-cleaner/lib/WalkAST.cpp
25

sorry missed that one, were you referring to definitions of methods? because AFAICT there isn't anything else inside this anon namespace that i can declare static in a named-namespace instead.
if that's the case, i don't think it's actually common in the codebase to define your methods out-of-line when the whole class is an implementation detail.

No worries. The style asks to move the definitions out of the anonymous namespace, but ....

maryammo added a subscriber: maryammo.EditedNov 9 2022, 8:48 AM

@kadircet, this commit causes failure on https://lab.llvm.org/buildbot/#/builders/121 which is possible to reproduce locally, can you please take a look?

[35/111] Linking CXX executable tools/clang/tools/extra/include-cleaner/unittests/ClangIncludeCleanerTests
FAILED: tools/clang/tools/extra/include-cleaner/unittests/ClangIncludeCleanerTests 
: && /usr/lib64/ccache/c++  -fPIC -fno-semantic-interposition -fvisibility-inlines-hidden -Werror=date-time -Wall -Wextra -Wno-unused-parameter -Wwrite-strings -Wcast-qual -Wno-missing-field-initializers -pedantic -Wno-long-long -Wimplicit-fallthrough -Wno-maybe-uninitialized -Wno-class-memaccess -Wno-noexcept-type -Wdelete-non-virtual-dtor -Wno-comment -Wno-misleading-indentation -fdiagnostics-color -ffunction-sections -fdata-sections -fno-common -Woverloaded-virtual -fno-strict-aliasing -O3 -DNDEBUG  -Wl,--gc-sections tools/clang/tools/extra/include-cleaner/unittests/CMakeFiles/ClangIncludeCleanerTests.dir/AnalysisTest.cpp.o tools/clang/tools/extra/include-cleaner/unittests/CMakeFiles/ClangIncludeCleanerTests.dir/RecordTest.cpp.o tools/clang/tools/extra/include-cleaner/unittests/CMakeFiles/ClangIncludeCleanerTests.dir/WalkASTTest.cpp.o  -o tools/clang/tools/extra/include-cleaner/unittests/ClangIncludeCleanerTests  -Wl,-rpath,/home/buildbots/ppc64le-clang-multistage-test/clang-ppc64le-multistage/stage1/lib  -lpthread  lib/libllvm_gtest_main.so.16git  -lpthread  lib/libclangIncludeCleaner.so.16git  lib/libclangTesting.so.16git  lib/libLLVMTestingSupport.so.16git  lib/libclangToolingInclusionsStdlib.so.16git  lib/libllvm_gtest.so.16git  lib/libclangFrontend.so.16git  lib/libclangAST.so.16git  lib/libclangBasic.so.16git  lib/libLLVMSupport.so.16git  -Wl,-rpath-link,/home/buildbots/ppc64le-clang-multistage-test/clang-ppc64le-multistage/stage1/lib && :
/usr/bin/ld: tools/clang/tools/extra/include-cleaner/unittests/CMakeFiles/ClangIncludeCleanerTests.dir/RecordTest.cpp.o: undefined reference to symbol '_ZTVN5clang18PPChainedCallbacksE'
/home/buildbots/ppc64le-clang-multistage-test/clang-ppc64le-multistage/stage1/lib/libclangLex.so.16git: error adding symbols: DSO missing from command line
collect2: error: ld returned 1 exit status
[79/111] cd /home/buildbots/ppc64le-clang-multistage-test/clang-ppc64le-multistage/llvm/clang/bindings/python && /usr/local/bin/cmake -E env CLANG_LIBRARY_PATH=/home/buildbots/ppc64le-clang-multistage-test/clang-ppc64le-multistage/stage1/lib /usr/bin/python3.6 -m unittest discover
..............


cmake -G Ninja ../llvm-project/llvm -DCMAKE_BUILD_TYPE=Release -DLLVM_ENABLE_ASSERTIONS=True '-DLLVM_LIT_ARGS='"'"'-v'"'"'' -DCMAKE_INSTALL_PREFIX=../stage1.install -DLLVM_ENABLE_ASSERTIONS=ON -DLLVM_ENABLE_RUNTIMES=compiler-rt -DBUILD_SHARED_LIBS=ON -DLLVM_CCACHE_BUILD=ON '-DLLVM_ENABLE_PROJECTS=clang;clang-tools-extra;llvm'

...