This is an archive of the discontinued LLVM Phabricator instance.

[clangd] A helper to find explicit references and their names
ClosedPublic

Authored by ilya-biryukov on Sep 20 2019, 4:21 AM.

Details

Summary

Allows to simplify pending code tweaks:

  • the upcoming DefineInline tweak (D66647)
  • remove using declaration (D56612)
  • qualify name under cursor (D56610)

Another potential future application is simplifying semantic highlighting.

Diff Detail

Repository
rL LLVM

Event Timeline

ilya-biryukov created this revision.Sep 20 2019, 4:21 AM
Herald added a project: Restricted Project. · View Herald TranscriptSep 20 2019, 4:22 AM

Thanks this mostly LG, but I think we need some tests feel free to copy the ones in D66647 as a start point. But I believe coverage for this one is bigger, as this tries to figure-out all references, whereas the define-inline patch only cares about references to non-local decls in contexts that can have a nested name specifier.

clang-tools-extra/clangd/FindTarget.cpp
369 ↗(On Diff #220999)

could you add some comments on what it does

373 ↗(On Diff #220999)

what is namedTarget ?

376 ↗(On Diff #220999)

can we simplify this by populating two vectors, Instantiations and Patterns, and then returning the patterns iff instantiations are empty?

381 ↗(On Diff #220999)

why do we discard aliases(specifically underlying decls) ? for example if we have vector<int> I believe decl(template<> std::vector<int>) will be marked with both TemplateInst and Underlying right ?

391 ↗(On Diff #220999)

same here for ignoring underlying decls

399 ↗(On Diff #220999)

these three(usingdirective,using,namespacealias) decls are the only ones that can have a nestednamespecifier.

Are we sure these are the only ones that can reference a decl? For example what about a ConstructorDecl with initializers referencing FieldDecls, I believe this function should rather return multiple locations in such cases.

409 ↗(On Diff #220999)

I think getUsingLoc will return the location for using keyword, not the name location. Maybe use getLocation ?

426 ↗(On Diff #220999)

again I believe this list is non-exhaustive, e.g. designatedinitilaizers. and again this also needs to return a vector instead.

432 ↗(On Diff #220999)

I believe it is better to return getFoundDecl then getDecl, the former respects using declarations.

506 ↗(On Diff #220999)

can we move this below ExplicitReferenceCollector to prevent splitting anon namespace?

548 ↗(On Diff #220999)

why not just traversenestednamespecifier and visitNode(L) instead of calling the base::traverse ?

570 ↗(On Diff #220999)

why not just visitNode(L) and traverse(prefix(L)) instead of calling base?

581 ↗(On Diff #220999)

what is nodeReference?

594 ↗(On Diff #220999)

So this endpoint doesn't seem to have downsides above(not visiting all referenced decls) as it performs traversal through an extra ASTVisitor.

What is the point of exposing the first method? Maybe we can getaway with not exposing it or strictly mentioning it might not return references in the children nodes?

clang-tools-extra/clangd/FindTarget.h
103 ↗(On Diff #220999)

drop :wa ?

kadircet added inline comments.Sep 23 2019, 6:21 AM
clang-tools-extra/clangd/FindTarget.h
115 ↗(On Diff #220999)

It might be better to accept a DynTypedNode in here as well, so that this can be used with other node types as well, or provide overrides for them(for example Decl).

kadircet added inline comments.Sep 23 2019, 6:55 AM
clang-tools-extra/clangd/FindTarget.h
93 ↗(On Diff #220999)

do we have any real use-case for this field? apart from "skipping instance members in define inline"? because that one can also be achieved easily through decl (there is a isCXXInstanceMember).

ilya-biryukov marked 18 inline comments as done.
  • Add simple tests
  • Address other comments

Added some tests too. More tests for dependent constructs are still missing, but please take a look at what we have for now.

clang-tools-extra/clangd/FindTarget.cpp
373 ↗(On Diff #220999)

Used to be called this before. Fixed, thanks.

376 ↗(On Diff #220999)

Did something very similar. PTAL.

381 ↗(On Diff #220999)

For ^vector<int>, we get vector<int>(TemplateInstantiation) and vector(TemplatePattern).
No Underlying.

Added a relevant test-case.

391 ↗(On Diff #220999)

As mentioned in the previous comment, we want to discard underlying. Underlying decl was not written in the source code, therefore should not be returned by this function.

399 ↗(On Diff #220999)

This function is non-recursive and ConstructorInitializer is a separate node under ConsturctorDecl.
Any reference to fields are owned (and reported when ConstructorInitializer is visited)

There might be more decls that I missed here, though, hopefully we'll find and add them later.

It's a bit hard to get all cases on the first try.

409 ↗(On Diff #220999)

Thanks for spotting this! This was supposed to be a temporary placeholder until I figure out a proper location.
But I forgot about it.

Fixed and added a test.

426 ↗(On Diff #220999)

Added a FIXME to add more exhaustive handling.
I'll make sure to fix it soon, but would want to first land non-exhaustive version.

I would try avoiding returning more than one element here. For designated initializers we seem to have multiple designators. Since this function is non-recursive, it should not return those.

Instead we would require our clients (i.e. our visitor) to call a separate function for each of the designators.

432 ↗(On Diff #220999)

Good point. Done. Added a test too.

548 ↗(On Diff #220999)

To avoid re-implementing the traversal logic. Base::Traverse does exactly that, we shouldn't re-implement traversal ourselves.

570 ↗(On Diff #220999)

For code reuse. See other comment too

581 ↗(On Diff #220999)

Now renamed to explicitReference. Thanks.

594 ↗(On Diff #220999)

I think we'll need it for D56610 (qualify name under cursor).
But happy to remove this from public API for now until we have a use-case for it.

Also updated a comment to explicitly mention it does not recurse into child nodes, hope its intention is clear now.

kadircet marked an inline comment as done.Sep 24 2019, 1:02 AM
kadircet added inline comments.
clang-tools-extra/clangd/FindTarget.cpp
376 ↗(On Diff #220999)

LG, thanks!

432 ↗(On Diff #220999)

I was actually referring to DeclRefExpr, change seems to be on MemberExpr

548 ↗(On Diff #220999)

I see but that should help get rid of TypeLocsToSkip and some extra traversals. I believe it would be worth getting rid of the additional state management, but up to you.

370 ↗(On Diff #221345)

nit: s/a declaration/declaration(s)/

519 ↗(On Diff #221345)

unnecessary closing and opening of anon namespace

577 ↗(On Diff #221345)

:wa is still around have:wa references :D

609 ↗(On Diff #221345)

I can see the second check is for getting rid of references coming from macro expansions. But what exactly is the first one for? How can we get an invalid nameloc, could you also add it as a comment?

clang-tools-extra/clangd/FindTarget.h
96 ↗(On Diff #221345)

Is this for testing purposes? maybe move it into findtargettests.cpp or make it a member helper like Print(SourceManager&) so that it can also print locations etc.?

clang-tools-extra/clangd/unittests/FindTargetTests.cpp
498 ↗(On Diff #221345)

this sounds more like LastOffset or PrevOffset

504 ↗(On Diff #221345)

I believe these are already filtered-out by findExplicitReferences ?

600 ↗(On Diff #221345)

Is it OK for this case to be different than X::a above?

also shouldn't this be a::b::struct S or None ?(my preference would be towards None)

638 ↗(On Diff #221345)

again qualifiers seems to be inconsistent with the rest of the cases.

ilya-biryukov marked 9 inline comments as done.Sep 24 2019, 5:47 AM
ilya-biryukov added inline comments.
clang-tools-extra/clangd/FindTarget.cpp
432 ↗(On Diff #220999)

Good point. Fixed that one too, thanks!

548 ↗(On Diff #220999)

I like the idea and actually tried to implement it before, but failed. The nested-name-specifier case is simple and we could actually implement it.

However, the ElaboratedTypeLoc case is more complicated: we have to call Base::TraverseTypeLoc to make sure we recursively visit children of ElaboratedTypeLoc::getNamedType (e.g. template arguments inside the typeloc). However, Base::TraverseTypeLoc automatically calls VisitTypeLoc and our qualifier is forgotten again.

Let me know if there's a way to implement this that I missed, though.

577 ↗(On Diff #221345)

Thanks. This slipped my attention last time :-)

609 ↗(On Diff #221345)

Ah, thanks for spotting this. I don't think we want to filter out macro references, fixed that and added a test

The first one is a precaution against various implicit nodes getting into the traversal. Normally this shouldn't happen, unless:

  • RecursiveASTVisitor has bugs and visits implicit nodes (e.g. generated begin() calls for range-based-for )
  • Users start traversal from an implicitly-generated statement.

I couldn't come up with examples of that actually happening, but I'm also scared of adding an assertion as I feel this would inevitably fail at some point.
At the same time, the whole point of this function is to only expose those references that could be meaningfully found in the source code (e.g. for refactoring and syntax highlighting), i.e. they should always have (at least) source location of the name.

Added a comment, let me know if this could be made clearer, though.

clang-tools-extra/clangd/FindTarget.h
96 ↗(On Diff #221345)

Yes, this is for output in the test. Moving to findTargetTests.cpp, we risk ODR violations in the future.
In any case, it's useful to have debug output and operator<< is common for that purpose: it enables llvm::toString and, consecutively, llvm::formatv("{0}", R).

What are the downsides of having it?

clang-tools-extra/clangd/unittests/FindTargetTests.cpp
498 ↗(On Diff #221345)

That's the next character in Code we need to process, renamed to NextCodeChar to specifically mention what string we're referring to.

504 ↗(On Diff #221345)

Not anymore (wasn't supposed to in the first place). Added tests too.
See the other comment for details.

600 ↗(On Diff #221345)

Qualifier captures exactly what's written in the source code, so it seems correct: S:: is what written down.
struct comes from the printing function in clang and I don't think it's worth changing. I believe this comes from the fact that qualifier is a type in this case.

Why would the qualifier be None? Maybe the purpose of this field is unclear?

638 ↗(On Diff #221345)

There are no qualifiers written in the source code in any of the two references, therefore both qualifiers are empty.
Again, please let me know if the purpose of Qualifier is unclear. Happy to comment or change to make its purpose clear.

ilya-biryukov marked 3 inline comments as done.
  • Use DeclRefExpr::getFoundDecl, add a test
  • Update a comment
  • Remove redundant anon namespace declaration
  • Remove :wa
  • Do not filter-out macro references
  • Document why we drop references with invalid locations, log if that happens
  • Rename NextOffset to NextCodeChar
kadircet marked 4 inline comments as done.Sep 24 2019, 6:35 AM
kadircet added inline comments.
clang-tools-extra/clangd/FindTarget.h
96 ↗(On Diff #221345)

I wasn't happy about it because it actually needs a SourceManager to print completely, and I believe that's also necessary for debugging.

But feel free to ignore if you're OK with that.

clang-tools-extra/clangd/unittests/FindTargetTests.cpp
498 ↗(On Diff #221345)

well, this looked more like "beginning of the last token we processed" to me. but now I see your point of view as well. Feel free to use whichever you want.

600 ↗(On Diff #221345)

ah sorry, I missed the fact that this was written as S::type in the code, now it makes sense. just ignore this one.

638 ↗(On Diff #221345)

yeah it totally makes sense now, the problem started with me missing the S:: in the previous example :D, nvm..

ilya-biryukov marked 8 inline comments as done.
  • Remove ExprMemberBase
  • Add an overload for Decl*

I believe all comments were addressed. PTAL and let me know if I missed something.

clang-tools-extra/clangd/FindTarget.h
96 ↗(On Diff #221345)

Having some llvm::toString is great, because it's nicely integrated to all of the infrastructure and very easy to use.
I'll stick with it for now, despite the limitations arising from the absence of source manager.

93 ↗(On Diff #220999)

Thanks, this is definitely a valid concern. I was overthinking the problem, removed it.

115 ↗(On Diff #220999)

I would avoid doing DynTypedNode - many of the cases don't quite make sense here, e.g. TypeLoc is arguably ok as it has location information, but Type* or QualType are not ok as they lack location information.

Decl definitely sounds useful, though. Would allow to visit all top-level decls of a file, which is another use-case we probably want. Added that.

kadircet accepted this revision.Sep 25 2019, 1:10 AM

LGTM, thanks!

clang-tools-extra/clangd/FindTarget.h
103 ↗(On Diff #221556)

nit: s/Decl *S/Decl *D/

This revision is now accepted and ready to land.Sep 25 2019, 1:10 AM
ilya-biryukov marked an inline comment as done.
  • s/Decl* S/Decl* D/g
This revision was automatically updated to reflect the committed changes.
Herald added a project: Restricted Project. · View Herald TranscriptSep 25 2019, 5:39 AM