This is an archive of the discontinued LLVM Phabricator instance.

[clangd] Add tweak for implementing abstract class
Needs ReviewPublic

Authored by njames93 on Jan 18 2021, 8:41 PM.

Details

Summary

Add a tweak that will create declarations for all its base classes unimplemented pure virtual methods.

Depends on D95231

Diff Detail

Event Timeline

njames93 created this revision.Jan 18 2021, 8:41 PM
njames93 requested review of this revision.Jan 18 2021, 8:41 PM
njames93 updated this revision to Diff 317456.Jan 18 2021, 8:42 PM

Forgot to add the untracked files

nridge added a subscriber: nridge.Jan 18 2021, 11:29 PM
njames93 updated this revision to Diff 317523.Jan 19 2021, 4:27 AM

Update printing policy.

njames93 updated this revision to Diff 317984.Jan 20 2021, 1:02 PM
  • Replace getFinalOverrides for a manual implementation, that method wasn't quite suited to what was needed w.r.t tracking access.
  • Add support for template classes with no dependant bases.
  • Add tests for template classes.
njames93 updated this revision to Diff 318315.Jan 21 2021, 2:15 PM

Split up the code a little more. Fix a few malformed comments.

njames93 added inline comments.Jan 21 2021, 2:17 PM
clang-tools-extra/clangd/refactor/tweaks/ImplementAbstract.cpp
39–41

Is this a good behaviour, or should we still offer the tweak, even knowing that applying it will still result in the class being marked as abstract?

Thanks this looks great, and something i've been longing for some time! But I got a couple of questions about the how the interaction is designed (sorry if I missed some high level discussions elsewhere, feel free to just point me in that direction).

  • Why do we just declare methods, rather than defining them too (with dummy bodies) ? This would allow users to move function bodies out-of-line if they want to easily. Also it would enforce them to fill in the bodies, rather than forgetting/skipping some of the methods. (Even more maybe we should take the extra step and offer another action to declare the function inline and define it out-of-line, but I think that could be done iteratively if we see the need)
  • Implementing all (pure) virtuals vs offering a code action for each possible method. It is unfortunate that our existing tweak infrastructure doesn't enable a single tweak to output multiple code actions, but I believe in this case we might achieve a whole lot better UX if we did offer implementing each method one by one, while possibly still suggesting implement "all" overrides or "only" pures. It is still useful in its current form ofcourse, as the user can manually change the declaration into a pure one, or delete it, but it sounds cumbersome if they only want to implement a small subset of all possible pure methods .
  • Acting on non-pure virtuals too, i believe this is also a quite common use case that could benefit a lot of users. but this definitely increases the need for a code action per override.
  • When to trigger? I suppose what you have in the code ATM makes sense (e.g. just on [[class X]] : ... [[{]] [[}]]) but it would be great to spell it out explicitly so that others have a chance to raise concerns. I don't think triggering within the body would be useful though, as users are likely to navigate within class body for various reasons, and spamming them with lots of codeactions at each cursor move is likely to be annoying.
  • When to land this :) we are at the edge of a branch cut, and tweaks are a feature triggered automatically by editors. so any crashes in there are likely to make clangd useless (as they'll persist no matter what). so I believe we should either wait for release cut, and land this afterwards and fixing any crash reports we get until next release, or include it in current release while marking the tweak as hidden so that it will only annoy experimental users.

Thanks this looks great, and something i've been longing for some time! But I got a couple of questions about the how the interaction is designed (sorry if I missed some high level discussions elsewhere, feel free to just point me in that direction).

There was no high level discussions on my end. I'm just a guy on my own, who makes what I like, I don't even work in software. Though if there is a public place where people seem to discuss these I'd like to be pointed in the direction(discord seems to be mainly used for user support and clangd-dev has only had one relevant post in the last 6 months). Right now seems that GitHub issues and here are my best bets.

  • Why do we just declare methods, rather than defining them too (with dummy bodies) ? This would allow users to move function bodies out-of-line if they want to easily. Also it would enforce them to fill in the bodies, rather than forgetting/skipping some of the methods. (Even more maybe we should take the extra step and offer another action to declare the function inline and define it out-of-line, but I think that could be done iteratively if we see the need)

I'm still unsure of the best way to go about that, Trouble with just defining empty methods is we will undoubtedly generate code that can't compile (without warnings), Mostly due to methods not having return statements or if we try and fix that, how do you implement the return when the function needs to return a reference.

  • Implementing all (pure) virtuals vs offering a code action for each possible method. It is unfortunate that our existing tweak infrastructure doesn't enable a single tweak to output multiple code actions, but I believe in this case we might achieve a whole lot better UX if we did offer implementing each method one by one, while possibly still suggesting implement "all" overrides or "only" pures. It is still useful in its current form ofcourse, as the user can manually change the declaration into a pure one, or delete it, but it sounds cumbersome if they only want to implement a small subset of all possible pure methods .
  • Acting on non-pure virtuals too, i believe this is also a quite common use case that could benefit a lot of users. but this definitely increases the need for a code action per override.

As code completion can help implement just one virtual method, not much is gained by offering to implement methods one by one.
Then there's the issue of say if a class has 50 virtual methods, Having 50 different refactoring show up in the UI is likely going to be some issue.
There's a discussion of adding a refactoring related methods to LSP which would enable a more interactive experience. There hasn't been an agreed design yet though. If that does make it through it would definitely extend the usefulness of this. Presenting a Interface to the user where they could choose exactly what methods they want to implement as well as letting them override already implemented virtual functions.

  • When to trigger? I suppose what you have in the code ATM makes sense (e.g. just on [[class X]] : ... [[{]] [[}]]) but it would be great to spell it out explicitly so that others have a chance to raise concerns. I don't think triggering within the body would be useful though, as users are likely to navigate within class body for various reasons, and spamming them with lots of codeactions at each cursor move is likely to be annoying.

Fair point. Regarding the base specifier, Currently it doesn't offer the tweak when over the base specifier, I may be inclined to tweak behaviour so If you are over the base specifier, then only offer to implement pure virtual methods from that base class. Although rather annoyingly, The selection considers the public|private|protected|virtual part of a base specifier as part of the derived class, Instead of being part of the base specifier. I'm gonna push a separate patch to address that though.

  • When to land this :) we are at the edge of a branch cut, and tweaks are a feature triggered automatically by editors. so any crashes in there are likely to make clangd useless (as they'll persist no matter what). so I believe we should either wait for release cut, and land this afterwards and fixing any crash reports we get until next release, or include it in current release while marking the tweak as hidden so that it will only annoy experimental users.

I couldn't even rush this and get it ready before Tuesday :) Definitely not gonna make the 12 cut, even in experimental state.

njames93 updated this revision to Diff 318552.EditedJan 22 2021, 9:23 AM

If cursor is over one of the base specifiers, offer to implement only the pure methods from that base class.

njames93 added inline comments.Jan 22 2021, 10:28 AM
clang-tools-extra/clangd/refactor/tweaks/ImplementAbstract.cpp
355

Maybe this should be changed as it currently prints the fully qualified type.

njames93 updated this revision to Diff 318678.Jan 22 2021, 4:25 PM
njames93 edited the summary of this revision. (Show Details)

Fix failing tests.
Updated message for tweak from a specified base class.

njames93 updated this revision to Diff 318932.Jan 25 2021, 2:38 AM

Update getSelectedRecord to work now that BaseSpecifier may contain nested Nodes.

njames93 updated this revision to Diff 319898.Jan 28 2021, 9:29 AM
  • Fix an issue where replacements could conflict with each other.
  • Add support for defining method stubs instead of just declaring them, possibly need a way to configure this behaviour.
  • Change tests to ignore whitespace because we dont clang-format them tweaks in tests.
njames93 updated this revision to Diff 329680.Mar 10 2021, 8:42 AM

Tweak the prepare method to just check for methods that need implementations, this saves some work that could be deferred to the apply method.

Thanks for working on this and sorry for the long delay!

I also think it's really useful, I hope we can simplify the impl a little and land it.

clang-tools-extra/clangd/refactor/tweaks/ImplementAbstract.cpp
2

I'd consider calling this OverrideVirtual.cpp. I think we'll want to support method-by-method triggering in future, and it would share most of the implementation.

(We don't have the infrastructure today, but there are certainly more cases where we want to offer alternate tweaks from the same "class". @kadircet maybe this is relevant to bazel build fixing?)

9

A file comment here would be a great place to describe the functionality and design choices.

e.g.

// A tweak to override inherited virtual methods in a class.
// 
// Currently, supports overriding all pure-virtual methods at once (i.e. implement abstract).
// It would be nice to add per-method overriding actions, but the Tweak interface can't do this today.
26

I know you just added this, but I think it's better to declare only, and hope to compose with a "define method" code action.

Reasons:

  • It's a lot of work to provide sensible defaults: return {} is clever, but unidiomatic for many return types.
  • We can't produce bodies for all return types (e.g. classes with no easy constructor). So we'll produce a bunch of methods that don't compile, which is distracting.
  • Inserting a dummy body that *does* compile places a burden on the user to keep track of it.
  • Inserting *in-line* definitions doesn't really save much typing or much thinking
  • code actions are a pretty simple interaction with few "options". Offering every permutation is unrealistic, and config doesn't seem like an ergonomic alternative. Our best hope IMO is combining sequential code actions.
  • keeps the scope small, smaller code actions are easier to maintain

@kadircet do you find this compelling? (Don't want Nathan caught in the middle :-))

47

how is this different from the hasAnyDependentBases check in isClassOK?

This is called & checked in a lot of places, and makes its way into the return value of various functions. It would be nice to validate upfront instead (and you do appear to do that)

62

It's hard to follow the logic of this function because the actual query logic is combined with recursive AST traversal, early-returns, in-out params etc.

Moreover it's hard to understand its relationship to the other functions here, and it's not clear how to generalize/share the logic (e.g. to support different code actions for all pure functions together, all virtual functions individually etc).

I'd suggest building a data structure first by walking the class hierarchy, and then querying it:

struct VirtualMethods {
  // All virtual methods anywhere in the hierarchy of the class T.
  DenseMap<const CXXMethodDecl *, MethodInfo> Methods;
  struct MethodInfo {
    CXXMethodDecl *OverriddenBy = nullptr;
    AccessSpecifier EffectiveAccess = AS_public; // in T
  };
};

This seems straightforward and cheap-enough to build, and you can easily query it for what we want here (methods not declared in T itself that are pure and not overridden). Later if we want to emit separate code actions for each virtual method, it's easy to reuse for that too.

139

the return value here should be the results, not whether the bases are dependent!

201

I don't think restricting to the methods of the selected base-specifier is worth the effort here and elsewhere.

For this to be useful, we need:

  • the class to have two bases
  • they each have virtual methods
  • the user wants to implement one but not the other
  • and the user is targeting the code action on the base spec

This is rare/obscure and we'd be better spending our complexity budget elsewhere.

Instead, just treating the base specifier as part of the class seems enough.

224

suggest also checking that isBraceRange() is valid and its end point (}) is a file ID not a macro.

That way we can guarantee that the find-insertion-point step never fails: if there's any problem, just insert before }.

241

this is not used outside getInsertionPoints so shouldn't be in the return value

244

why not insert before decls instead of after them? Start locations are pretty reliable.

250

we generally try to avoid running the lexer.
You shouldn't need it at all I think, by inserting before, but TokenBuffer is preferred if so.

263

this function seems more complex than necessary because it's trying to handle all 3 access levels at once.
just take the access spec as a param, return the location, and have the caller call several times?

263

seems a bit cleaner to leave SourceLocation wrangling to the caller that's dealing with edits, and just return a decl_iterator to insert before.

302

seems surprising to handle this as a special case, rather than just keeping track of the effective access as we go.

What about:

AccessSpec CurAccess = class ? private : public;
decl_iterator Best;
enum {None, Bad, Good} Quality = None;
for (I = decls_begin(), E = decls_end(); I != E) {
  Decl &D = *I++;
  if (D is AccessSpecDecl)
    CurAccess = D.getAccess();  ​
 ​if (CurAccess == TargetAccess) {
   ​auto NewQuality = isa<CXXMethodDecl, AccessSpecDecl> ? Good : Bad;
   if (NewQuality >= Quality) {
     Quality = NewQuality;
     Best = I;
   }
 ​}
}
if (Quality == None) {
  Best = decls_end();
  MustInsertSpecifier = true;
}
Herald added a project: Restricted Project. · View Herald TranscriptApr 14 2021, 3:25 PM
kadircet added inline comments.Apr 15 2021, 3:02 AM
clang-tools-extra/clangd/refactor/tweaks/ImplementAbstract.cpp
2

(We don't have the infrastructure today, but there are certainly more cases where we want to offer alternate tweaks from the same "class". @kadircet maybe this is relevant to bazel build fixing?)

Yes we should hopefully have support for those in the near future!

26

I agree 100%.

In addition to all of those, getting linker errors for missing bodies is a lot better than debugging arbitrary runtime misbehaviour due to a defaulted return type.

As a future note on the define method code action, i think rather than trying to generate a compiling definition we should construct a body like:
return /*magic*/; to ensure user gets some diagnostics (at least for non-void functions), that they can use to jump later on.

One thing I noticed was that you need to be over the base class or overriding class identifier in order to get the code action.

For some editors like Visual Studio Code this is less noticeable but for others like VIM it is more noticeable as you need to know it's there.

Is there a way to trigger the tweak when it is anywhere in the class? I have been investigating how to do it but I am still learning a lot about the clang AST and the internals of clangd.

What do other people think?

Might not be worth adding to this patch as it would over-complicate an already complicated patch.

@njames93 I had completely forgotten about this when I attempted the same thing early this year.
I never finished it but wanted to share what I had in case it's useful: http://reviews.llvm.org/D122827. Also happy to finish that sometime if you're not keen on completing this.

Some thoughts based on that:

  • I think it's worth handling both "implement pure-virtuals" and "override virtuals" as the same tweak, choosing based on context. >90% of the work is the same. (I think my patch gets detection wrong though).
  • we now have a library to pick insertion points in a class a bit more declaratively
  • I think you can use getFinalOverriders to avoid a lot of work traversing class hierarchies
  • if you're not using getReturnType().print(..., /*Placeholder=*/Declarator) then functions-that-return-function-pointers will definitely be rendered wrong :-)
Herald added a project: Restricted Project. · View Herald TranscriptMar 31 2022, 8:18 AM

@njames93 I had completely forgotten about this when I attempted the same thing early this year.
I never finished it but wanted to share what I had in case it's useful: http://reviews.llvm.org/D122827. Also happy to finish that sometime if you're not keen on completing this.

Some thoughts based on that:

  • I think it's worth handling both "implement pure-virtuals" and "override virtuals" as the same tweak, choosing based on context. >90% of the work is the same. (I think my patch gets detection wrong though).
  • we now have a library to pick insertion points in a class a bit more declaratively
  • I think you can use getFinalOverriders to avoid a lot of work traversing class hierarchies
  • if you're not using getReturnType().print(..., /*Placeholder=*/Declarator) then functions-that-return-function-pointers will definitely be rendered wrong :-)

I stopped working on this a while ago but happy to resume (I've cherry picked this to the custom clangd build I use and it definitely has some value)
However I think we need a discussion about the ideal interface for this as there is no perfect solution for each use case.
For example where do we trigger the tweak, and do we only override the pure virtual functions or is there value to have overrides for all virtual functions.

njames93 updated this revision to Diff 421371.Apr 7 2022, 5:33 PM

Use new tweak InsertionPoint logic.

Trass3r added a subscriber: Trass3r.Oct 6 2022, 3:39 PM