This is an archive of the discontinued LLVM Phabricator instance.

[include-cleaner] Fix the member-expr-access usage for sugar type.
ClosedPublic

Authored by hokein on Dec 15 2022, 5:19 AM.

Diff Detail

Event Timeline

hokein created this revision.Dec 15 2022, 5:19 AM
Herald added a project: Restricted Project. · View Herald TranscriptDec 15 2022, 5:19 AM
Herald added a subscriber: kadircet. · View Herald Transcript
hokein requested review of this revision.Dec 15 2022, 5:19 AM
Herald added projects: Restricted Project, Restricted Project. · View Herald TranscriptDec 15 2022, 5:19 AM
sammccall added inline comments.Dec 15 2022, 6:54 AM
clang-tools-extra/include-cleaner/lib/WalkAST.cpp
41

Resolving a template type to only its template name (ignoring template args) seems dubious in general. Foo is an important part of vector<Foo> and a critical part of unique_ptr<Foo>.

Currently we're only using this for the base types of MemberExpr, and here what we really care about is: which decl is responsible for this member? For this, the template (as opposed to args) is a reasonable heuristic. The exceptions I can think of:

  • dependent base: template <class T> class Wrapper : public T {}; - this is rare I think
  • type aliases: template <class T> using raw_pointer = T*;
  • smart pointers (non-exception): unique_ptr<T>: the base of the member expr ends up being T* at least in non-dependent cases, so we're fine

So I'd suggest:

  • renaming this function to getMemberProvider(QualType Base) or so
  • adding a comment explicitly stating that the template name rather than params generally provide the members
71–72

this comment is confusing: it suggests that reporting a usage preverts inserting a header!
Would be nice to clarify:

  • we report a usage so that code returnFoo().bar can keep #include "foo.h"
  • reporting the member decl would cause problems (inheritance, aliases)
80

only tangentially related, but should these be implicit references?

I don't think we actually want to insert headers based on them, right? Just allow people to keep the ones that they have inserted that are required for compilation?

This would also make it less severe if our heuristic above goes wrong.

hokein updated this revision to Diff 483453.Dec 16 2022, 1:21 AM

address comments

hokein marked 2 inline comments as done.Dec 16 2022, 1:22 AM
hokein added inline comments.
clang-tools-extra/include-cleaner/lib/WalkAST.cpp
80

I don't think we actually want to insert headers based on them, right?

I think no? We want to insert headers, below is the motivated case

// all.h
#include "foo.h"
#include "declare.h"

// declare.h
class Foo;
Foo& getFoo();

// in main.cc
#include "all.h"
void k() {
   getFoo().member;
}

After the cleanup, we expect: all.h is removed, declare.h, and foo.h are inserted, right? I think this is the case we should care about, it is a simple version of protobuf case.

sammccall added inline comments.Dec 16 2022, 3:00 AM
clang-tools-extra/include-cleaner/lib/WalkAST.cpp
80

I would expect declare.h to be removed and foo.h to be inserted, and then the user to have to insert Foo.h.

The problem with inserting "foo.h" here is that we'll do it even if declare.h includes foo.h, which is a false positive according to the IWYS principle.

I think this is the case we should care about, it is a simple version of protobuf case.

I think the plan was to have only limited support for forward-declarations (because we can't reason about whether a full declaration is needed), and consider special-casing protos later because we do understand in that case.

Mostly comment nits and was ready to approve, but I think I found a bug (getAs)

clang-tools-extra/include-cleaner/lib/WalkAST.cpp
46

hmm, isn't getAs<> wrong and dyn_cast right here?

e.g. if we have a UsingType wrapping a TypedefType, we'll return the typedef rather than the using (because we check for typedef first, and getAs<> will unwrap)

50

nit: "A heuristic to" => "A heuristic: "

(otherwise it sounds like the *aim* here that we're approximating is to resolve to only the template name, rather than resolving to the template name being the approximation)

51

use -> using
critial -> critical

52

the template name -> the template

72

will => would (subjunctive makes it clearer this is *not* what we're doing)

cause issues => say what the problems are: "(e.g. force including the base class for inherited members)"

(sorry for unclear previous comment)

hokein updated this revision to Diff 483493.Dec 16 2022, 4:42 AM
hokein marked 3 inline comments as done.

address comments.

clang-tools-extra/include-cleaner/lib/WalkAST.cpp
46

oh, right, good catch! Added a test.

I used getAs is to unwrap the outer sugar ElaboratedType for Base, I think we have to do it manually here.

ElaboratedType 0x56313be324a0 'Foo' sugar
`-UsingType 0x56313be32470 'ns::Foo' sugar
  |-UsingShadow 0x56313be32410 'Foo'
  `-RecordType 0x56313be32200 'struct ns::Foo'
    `-CXXRecord 0x56313be32170 'Foo'
80

I would expect declare.h to be removed and foo.h to be inserted, and then the user to have to insert Foo.h.

I think by "foo.h to be inserted" you mean declare.h, right? There is a problem as well, we break the code after the cleanup :( (I'd image this is a common pattern used in LLVM)

Thanks I see the point now. (proto is just a special case rather than doing it in generally). I added a FIXME here, and plan to fix it in a follow-up patch (this is an existing issue).

clang/include/clang/AST/Type.h
2595

with the dyn_cast, we don't need this template specialization for UsingType now. But I think it is useful to keep it (I have spent sometime on debugging out why getAs<Typedef> work but getAs<UsingType> not).

sammccall accepted this revision.Dec 16 2022, 5:37 AM

LG, thanks!

clang-tools-extra/include-cleaner/lib/WalkAST.cpp
45

I think the relative order of unwrapping pointer and elaboratedtype is correct, but I also think we're going to end up adding more ignored sugar here and reasoning about the order is hard.

I think return getMemberProvider(Base->getPointeeType()) (and the same for ElaboratedType) is a more robust pattern that will be easier to extend.

46

Yep, unfortunately I think we have to be explicit about what kind of sugar we want to unwrap vs use.

For written types (i.e. typelocs) we get this for free, as basically want to unwrap if there's another typeloc (lexically) contained. But I don't know if we can easily reuse that machinery.

80

I think by "foo.h to be inserted" you mean declare.h, right

Sorry, yes I meant all.h removed, declare.h inserted. (Not at all what I wrote)

Yes we break the code. The condition for doing so is:
a) the current file is not IWYU-clean to start with
b) the codebase relies on forward-declarations
c) forward-declarations aren't always sufficient

So this is bad, but I'm not sure actually disastrous.

I added a FIXME here, and plan to fix it in a follow-up patch (this is an existing issue).

thanks!

This revision is now accepted and ready to land.Dec 16 2022, 5:37 AM
This revision was automatically updated to reflect the committed changes.
hokein marked an inline comment as done.