This is an archive of the discontinued LLVM Phabricator instance.

[clangd] Add a tweak for adding "using" statement.
ClosedPublic

Authored by adamcz on Mar 19 2020, 8:03 AM.

Details

Summary

This triggers on types and function calls with namespace qualifiers. The
action is to remove the qualifier and instead add a "using" statement at
appropriate place.

It is not always clear where to add the "using" line. Right now we find
the nearest "using" line and add it there, thus keeping with local
convention. If there are no usings, we put it at the deepest relevant
namespace level.

This is an initial version only. There are several improvements that
can be made:

  • Support for qualifiers that are not purely namespace (e.g. record

types, etc).

  • Removing qualifier from other instances of the same type/call.
  • Smarter placement of the "using" line.

Diff Detail

Event Timeline

adamcz created this revision.Mar 19 2020, 8:03 AM
Herald added a project: Restricted Project. · View Herald TranscriptMar 19 2020, 8:03 AM

This seems really well-thought-out.
I'm being (even) more verbose than usual about the interesting AST details. Please do push back/defer fixing anything that adds a lot of complexity and doesn't seem important.

clang-tools-extra/clangd/refactor/tweaks/AddUsing.cpp
21

function calls is obsolete I think, I'd just call these references or DeclRefExpr

23

You could put some general comments about the scope/choices here (applicable only to namespaces, insertion strategy)

23

Removing qualifier from other instances of the same type/call.

I think this is valuable enough to be worth calling out in the top-level comment as not-yet-done.
The other extensions you mention seem like we might never get around to them and it'd be fine.

35

is this redundant with Loc.isValid()?

If so, I'd either just use Loc (with a comment), or use optional<SourceLocation> with a comment, to emphasize the relationship.

38

can we get away with always inserting \n and relying on clang-format to clean it up?

45

this doesn't use any state - make it static or a free function?

49

nit: I quite like the abbreviated names for locals but this deserves a real name like Qualifier or QualifierToRemove.

55

Can we make this dynamic? The more clearly we communicate what is targeted, the more confidence the user will have to use it.

title() may only be called after a successful prepare(), so we can use the prepared state.

Maybe something like Add using-declaration for Bar, and remove qualifier?

57

nit: not actually a statement :-(

If "using-declaration" is overly laywerly (and easily confused with using-directive), then maybe we should just spell it out: Add "using foo::Bar" ...

69

no need for the first two conditions, getFileID will give you an invalid fileid and a macro fileid respectively

72

nit: we conventionally leave out the {} on one-line if bodies etc.

73

This check is sufficient for correctness, but we're still going to traverse all the parts of the AST that can't pass this check.

I think you can override TraverseDecl. If the Decl is a DeclContext and it doesn't enclose the selectiondc, then return (pruning that subtree). Otherwise call RecursiveASTVisitor::TraverseDecl and continue as before.

(I think you still want this check though - pruning decls is enough for performance but may not be for correctness)

90

I like the idea here, but I'm not sure it quite works. e.g. any typeloc has a directly enclosing typeloc, the inner one can't be targeted. So what about x::S*? (pointertypeloc > elaboratedtypeloc > recordtypeloc)?

  • looping up from NNS until you get to something else makes sense
  • unwrapping typeloc -> elaboratedtypeloc makes sense, but I don't think you ever want to unwrap multiple levels?
  • i don't think these cases overlap at all, you want one or the other

Am I missing something?

124

I think you also want to check that name is non-null, which would be the case for non-identifier names like foo::operator+(...).

133

One possible hiccup is insertion splitting a comment from its target. Cases I can think of:

  • namespace { // anonymous -> namespace { using foo::bar; // anonymous. This seems rare/unimportant.
  • /* docs */ int firstTopLevelDecl -> /* docs */ using foo::bar; int firstTopLevelDecl. This seems common/bad.
  • /* for widgets */using foo::Qux; -> /* for widgets */ using foo::bar; using foo::Qux. This seems rare/unimportant.

I think you could handle the decl case by calling ASTContext::getLocalCommentForDeclUncached() and using the getBeginLoc() of the returned comment if any.

That said, this is probably something that we should be doing throughout clangd and we should have some common function in AST.h or so to take care of it. It seems fine to leave a comment and we can think about this later. (Or file a bug where we can dump a list of places that should be accounting for the comment range)

145

This is basically always going to be already sorted. AST traversal isn't strictly left-to-right but pretty close (I think the major exceptions are weird objc things). So you could leave this out.

149

ah, neat :-)
but you could check this in the recursive AST visitor instead, and use it to prune the tree.

151

Hmm, I *think* it's probably true that NNS->getAsNamespace() always refers to the canonical (first) NamespaceDecl. If not, you can getCanonicalDecl() on both to ensure this.

162

the tweak infrastructure should do this in any case, it runs clang-format around changed areas (assuming using-decl sorting is on, which it is in the default styles).

do you not see this behaviour? (unit-tests won't exhibit it as the clang-format step is deliberately excluded from them)

207

(Sorry if some of this is obvious: TL;DR: we should use TokenBuffer to handle macros in this case)

Whenever we try to use SourceLocations to reason about written source code, we have to think about macros. Some libraries try to encapsulate this, but the road to confusion is paved with good intentions, and it's often best to decide on the policy and be explicit about it.

For example, when provided a macro location, the tooling::Replacement constructor uses its spelling location.
Given:

// imagine we're trying to abstract away namespaces for C or something
#define NS(X) foo::X
NS(Bar) boo;

Running this action on N would result in changing the definition of the NS macro to delete the "foo::", as that's where the qualifier was spelled!

It would be reasonable (but actually not trivial!) to try to detect any macro usage and bail out. The general case of "is there some sequence of source-code tokens that correspond to this range of preprocessed tokens and nothing else" is pretty hard.

However TokenBuffer does have APIs for this exact question, as it was designed for writing refactorings that replace nodes. I think you want:

  • expandedTokens(NNSL.getSourceRange())
  • spelledForExpanded(ExpandedTokens)
  • Token::range(SM, Spelled.front(), Spelled.back())
  • Replacement("fname", Range.Offset, Range.Length, "")

(ugh, that's really awkward. Maybe we should have a helper in TokenBuffer that combines 1-3)

208

BTW, this is an instance of a common pattern where we have the location of the start of a token (in this case, the last token in the range) and want the token bounds, and end up running the lexer in raw mode to get it! The giveaway is LangOpts which it needs it order to lex.

This is probably not terrible here but it's fiddly and inefficient (when done everywhere) and gets things wrong sometimes. TokenBuffer stores the expanded token stream, the spelled tokens from the main file, and the correspondence between the two, and should generally be used where possible. It's stored in the ParsedAST.

227

Again macros.
This seems a bit more pressing, there's a nontrivial chance that you're inserting inside a namespace which is introduced by invoking a FOO_NS_BEGIN macro.

Unfortunately I don't see a really simple way to insert after this macro - using the expansion location fails if macro doesn't end after the }. (e.g. ANON_NAMESPACE(f::X y;))
(Also you'd need to take care of inserting a space after the macro invocation to avoid FOO_NS_BEGINusing.)

I'd suggest simply skipping over any candidate locations (namespaces or using declarations) that aren't in the main file, using the expansion location of the first-top-level-decl fallback, and asserting here that the location is in the main file. (We use raw assert() in LLVM)

clang-tools-extra/clangd/unittests/TweakTests.cpp
2436

alternately: using ::one::two::ff.
Correct more often, uglier/longer, probably a style preference but not worth having an option for. Pick one :-)

nridge added a subscriber: nridge.Mar 26 2020, 4:28 PM
adamcz updated this revision to Diff 253164.Mar 27 2020, 10:28 AM
adamcz marked 22 inline comments as done.

Addressed review comments

adamcz updated this revision to Diff 253165.Mar 27 2020, 10:29 AM

clang-format

adamcz added inline comments.Mar 27 2020, 10:31 AM
clang-tools-extra/clangd/refactor/tweaks/AddUsing.cpp
35

Yes, it's redundant. I went back and forth on this one, original version used Loc.isValid(). I decided I like being explicit better than having slightly more complicated semantics of the Loc field, but I wasn't too confident about this. Having second opinion helps :-)

I switched to isValid()

38

Nope, that doesn't work. Auto-formatting doesn't seem to remove blank lines between using statements, for example. It's likely to allow you grouping them in some way.

45

It uses Name and NNSL . Would you prefer this as free function, with both passed as arguments. My initial thought was that, since prepare() and apply() already share these via class members this was fine, but I'm certainly not against making this a free function.

57

I think spelling it out may be too long sometimes, namespaces are often long.

Changed to using-declaration

72

Uhh...is that a hard rule? I personally hate that, it's just asking for Apple SSL-style bugs
https://www.imperialviolet.org/2014/02/22/applebug.html

90

If you have a class foo::bar::cc and a struct st inside it, and then do:
foo::bar::c^c::st *p;
you end up with:
PointerTypeLoc -> ElaboratedTypeLoc ->NestedNameSpecifierLoc -> RecordTypeLoc
in which case you need to go up from type to NNSL and then up again, from NNSL to something that's not NNSL.

You have a good point with the PointerTypeLoc, that's a bug. We should stop going up the tree as soon as we find ElaboratedTypeLoc. I added a test for that.

133

Added a comment. I'd be happy to fix that, but I feel it's best done in a separate change.

145

Done. I also added a comment to mention that we assume it's sorted.

162

Didn't know that was the default. I'll drop the comment in that case.

207

You have a good point that this doesn't work well with macros, but I'm not entirely sure how you think this should work.

I'd argue that code actions should refer to the thing under the cursor, not it's expansion. That would be confusing to the user, as they would not be able to understand what the action offered is, nor how it would affect other places. So in your example of
#define NS(X) foo::X
I'd argue that we should not offer the action.
We should, however, support something like:
EXPECT_TRUE(fo^o::bar());
In this case, the qualifier is spelled under the cursor and the fact that EXPECT_TRUE is a macro and not a function should not matter to the user.

I updated the code to support that and added tests.
We can use isMacroID() and isMacroArgExpansion() to filter out macros, except for macro arguments. Note that we can't use spelledForExpanded(), since the qualifier might not be the entire expansion of the macro, thus spelledForExpanded will return None.

Let me know if any of what I did here now doesn't make sense.

227

We already skip over "using" not in main file. I added check for namespace and fixed the above-first-top-level-decl to be above getExpandedLoc() of that, which works for something like a macro declaring a namespace. It's not ideal, but I think it should be good enough, at least for now. Also added a test.

Not sure what the point of assert() here would be? Wouldn't it be far better to return an error, since it's trivial at this point and it won't crash the whole server, including the background index and such? I get that assert() can be useful in places where returning error is difficult and for things that should never, ever happen, but here seems like an overkill.

Anyway, editMainFile() already returns an error when the edit is not in the main file. IMHO that's good enough, what do you think?

clang-tools-extra/clangd/unittests/TweakTests.cpp
2436

The current code keeps the :: if it was there, but won't add it, so basically we respect whatever the user typed. Seems OK to me, do you agree?

I changed one test to include this aspect.

sammccall accepted this revision.Mar 27 2020, 3:57 PM

LG, thanks! Let me know if/when I should land this for you.

clang-tools-extra/clangd/refactor/tweaks/AddUsing.cpp
45

Oops, right. I think since this is the complicated part and it has a fairly simple interface, making its inputs/outputs explicit is nice hygiene.

(prepare and apply do communicate through class state, which feels a bit hacky to me. We didn't come up with anything nicer but still flexible+simple when designing this. It is conventional and documented at least)

72

There's no hard rule, but we do use that style consistently and consistency has value too.

The clang-tidy check readability-misleading-indentation catches that bug. Can I suggest:

  • (for us) adding it to .clang-tidy in llvm-project? This will affect clangd and the phabricator linter
  • (for the world) we could start a list of on-by-default clang-tidy checks for clangd users with no .clang-tidy config, and include this check
90

foo::bar::c^c::st *p;

But this isn't a case we support, right? We only support extraction when the NNS consists entirely of namespaces, and such NNSes don't have any children apart from the qualifier NNS.

105

This is just SourceManager::isMacroBodyExpansion(), I think.

163

I think the endloc check should rather be SM.isWrittenInSameFile - it's no good if e.g. they're both macro arg expansions, but different ones.

171

nit: you've mentioned the two cases I don't think we should bother fixing, but not the crucial one: we're anchoring to a regular decl, and we're going to insert in between the decl and its doc comment.

207

in your example of #define NS(X) foo::X I'd argue that we should not offer the action.

Indeed, sorry I didn't spell that out - I was highlighting it because the original code silently did something (bad) in this case.

can't use spelledForExpanded(), since the qualifier might not be the entire expansion of the macro

Ah, D75446 hasn't landed yet. @kadircet, want to land this soon? :-)

227

Behaviour looks good.

Not sure what the point of assert() here would be?

This is a consistency check - we're relying on an invariant we think we've established elsewhere. If this goes wrong, it's a programmer error, and for better or worse, LLVM defines those as asserts.

Practical benefits:

  • developers are more likely to notice when these are violated (assuming an -UNDEBUG build)
  • we get a stacktrace, which is often useful (not really here). Returning an error doesn't give us that.
  • we don't pay for the check in release builds
  • it documents the distinction between invariant-breaking bugs and dynamically-possible error conditions
  • consistency (in principle we could change all our err-handling, but we can't change this in clang)
245

yikes, this copies the tokenbuffer, I didn't think that was even possible!
auto&

clang-tools-extra/clangd/unittests/TweakTests.cpp
2436

Actually yeah, that's nice and neatly avoids having to make and defend a policy :-)

This revision is now accepted and ready to land.Mar 27 2020, 3:57 PM
sammccall added inline comments.Mar 27 2020, 4:10 PM
clang-tools-extra/clangd/refactor/tweaks/AddUsing.cpp
245

Copy constructor deleted in 94938d7d41cd11c4539ff93b801fe53cb4fddba2

kadircet added inline comments.Mar 28 2020, 12:06 PM
clang-tools-extra/clangd/refactor/tweaks/AddUsing.cpp
207

done, let me know if it doesn't work :D

249

why not use spelledForExpanded in here as well?

adamcz updated this revision to Diff 253583.Mar 30 2020, 6:45 AM
adamcz marked 10 inline comments as done.

review round 2

clang-tools-extra/clangd/refactor/tweaks/AddUsing.cpp
90

Not right now, but I left a comment that we should improve that. The way it is now, code is correct, always, and improving it is just a matter or removing the "no record types" check and replacing with a bit more code. Seems better to me.

207

Works great, thanks!

227

OK, I added the assert(). I'm not against asserts in general, I just find them often insufficient and not a replacement for better error handling. In this case, it certainly won't hurt.

245

Haha, that's a lovely bug. Thanks for fixing the ctor.

249

It didn't work before. Thanks for improving it!

Still LG, should I land this?

adamcz added a comment.Apr 1 2020, 4:51 AM

Still LG, should I land this?

Yes please :)

This revision was automatically updated to reflect the committed changes.