This is an archive of the discontinued LLVM Phabricator instance.

[clangd] DefineInline action apply logic with fully qualified names
ClosedPublic

Authored by kadircet on Aug 23 2019, 5:42 AM.

Details

Summary

Initial version of DefineInline action that will fully qualify every
name inside function body. It has some problems while qualyfying dependent
template arguments, see FIXME in the unittests.

Diff Detail

Event Timeline

kadircet created this revision.Aug 23 2019, 5:42 AM
ilya-biryukov added inline comments.Sep 19 2019, 1:25 AM
clang-tools-extra/clangd/refactor/tweaks/DefineInline.cpp
165

Are we trying to replace the whole name?

We can avoid this problem by not qualifying template arguments in the first place.
I.e. instead of replacing the whole name, including template arguments:

[[baz<Bar>]] -> [[::ns::baz<Bar>]]

We could simply replace the part before template arguments:

[[baz]]<Bar> -> [[::ns::baz]]<Bar>
kadircet updated this revision to Diff 221912.Sep 26 2019, 4:18 AM
  • Rebase and update testhelpers
kadircet updated this revision to Diff 221913.Sep 26 2019, 4:20 AM
  • Mark tweak as visible

Currently we add too many qualifiers in some cases, e.g. here's a common pattern in clangd:

// foo.h
#include "Decl.h"
namespace clang::clangd { std::string printName(Decl*D); }

// foo.cpp
#include "foo.h"
namespace clang::clangd { 
  std::string printName(Decl *D) {
    NamedDecl* ND = ...;
  } 
...
}

this will result in:

// foo.h
namespace clang::clangd {
std::string printName(Decl*D) {
  clang::NamedDecl *ND = ...; // <-- (!) we did not even had any 'using' declarations or 'using namespace's 
}
}

It's ok to leave this out of the initial change, but could we describe our strategy to tackle this somewhere in the comments - how we want to fix this and when.

clang-tools-extra/clangd/unittests/TweakTesting.h
81–82

It seems weird that we have this inconsistency between the contents for current file (passed through the return value) and contents for other files (pass through the fields).

A few better alternatives that come to mind:

  1. add an out parameter, could be optional in case when all changes are in the main file.
  2. this function will fail when there are changes outside main file, but we can add a new function to return changes in all modified files, e.g. as StringMap<std::string> or vector<pair<Path, string/*Content*/>> (the latter allows to use standard matchers)

We also need to rename parameters sometimes, right?

// Sometimes we need to rename parameters.
void usages(int decl_param, int);

void usages(int def_param, int now_named) {
  llvm::errs() << def_param + now_named;
}

// And template parameters! (these are even more interesting)
template <class T>
struct Foo {
  template <class U, class>
  void usages();
};
template <class L>
template <class R, class NowNamed>
void Foo<L>::usages() {
  llvm::errs() << L() + R() + NowNamed();
}
clang-tools-extra/clangd/refactor/tweaks/DefineInline.cpp
178

I don't think it's true in presence of using declarations:

namespace ns {
  void foo(int*);
}
namespace other_ns {
  void foo(char*);
}

using ns::foo;
using other_ns::foo;

template <class T>
void usages() {
  foo(T()); // <-- would we add `ns::foo` or `other_ns::foo`?
}

Not qualifying in this case is probably ok, but adding any of the qualifiers is probably wrong.
Could you check what we do in this case and either add to tests or put into a list of known issues (e.g. file a bug or at least add a FIXME?)

184

NIT: could you add an example of a member expression here?

Note that sometimes there is no left side (it's implicit), but in that case we shouldn't qualify either.

185

I believe we could also avoid qualifying anything from non-namespace scope.
This follows from the fact that one cannot move stuff from class to namespace with using declarations:

namespace ns {
class X {
  static void foo();
  void usages();
};
}

using ns::X::foo; // <- not allowed!
void ns::X::usages() {
  foo(); // <-- so no need to qualify this!
}

If I'm reading the code correctly, we will get ns::foo() in the current version.

254

Not sure if this covers these two cases (could you please add them to the tests?):

template <class T>
struct vector {
  void foo();
};

template <class T>
void vector<T>::foo() {}

template <class T>
struct list {
  template <class U>
  void foo();
}

template <class T>
template <class U>
void list<T>::foo() {}

Or are we disabled in those cases?

335

This never fails since there are no replacements to conflict with, right?
Maybe we could store a single tooling::Replacement here instead? That should simplify rest of the code (e.g. no need to clear() replacements and re-use them for different file)

We also need to rename parameters sometimes, right?

// Sometimes we need to rename parameters.
void usages(int decl_param, int);

void usages(int def_param, int now_named) {
  llvm::errs() << def_param + now_named;
}

// And template parameters! (these are even more interesting)
template <class T>
struct Foo {
  template <class U, class>
  void usages();
};
template <class L>
template <class R, class NowNamed>
void Foo<L>::usages() {
  llvm::errs() << L() + R() + NowNamed();
}

This is tricky, and there are also cases where the declaration doesn't provide the name of parameter, e.g. llvm::json::Value toJSON(const CodeAction &);. Maybe we can overwrite the declaration signature with the definition's, or just disable the tweak on these cases.

kadircet updated this revision to Diff 222102.Sep 27 2019, 2:05 AM
kadircet marked 5 inline comments as done.
  • Address comments

It's ok to leave this out of the initial change, but could we describe our strategy to tackle this somewhere in the comments - how we want to fix this and when.

I actually have a fixme saying:

// FIXME: Instead of fully qualifying we should try deducing visible scopes at
// target location and generate minimal edits.

Elaborating on it by saying "we can start by using the namespaces in targetcontext".

clang-tools-extra/clangd/refactor/tweaks/DefineInline.cpp
178

as discussed offline;
action should not be available in that case and we need underlying

clang-tools-extra/clangd/unittests/TweakTesting.h
81–82

It is left-out this way since most of the checks only care about a single file, and there are lots of them so changing would definitely require some plumbing.

I am not sure having the inconsistency between main file and others matters so much though; we have similar discrimination towards non-main files in a bunch of other places in testing infrastructure e.g TestTU.

As for the suggestions, adding another apply function for multifile case doesn't seem so nice, so I would rather move forward with first one, if you believe this is a strong concern.

I actually have a fixme saying:

// FIXME: Instead of fully qualifying we should try deducing visible scopes at
// target location and generate minimal edits.

Elaborating on it by saying "we can start by using the namespaces in targetcontext".

Are we planning to fix this right away or should we keep this indefinitely?
If latter, I believe we should think about some heuristics to avoid qualification in the short term instead.
If we want to do it right away, that's great! (albeit more complicated, I guess)

Are we planning to fix this right away or should we keep this indefinitely?
If latter, I believe we should think about some heuristics to avoid qualification in the short term instead.
If we want to do it right away, that's great! (albeit more complicated, I guess)

That's the end goal, but I am currently working on a patch that will at least strip away the common namespaces
of function declaration and symbols used inside.

ilya-biryukov added inline comments.Sep 27 2019, 2:36 AM
clang-tools-extra/clangd/unittests/TweakTesting.h
81–82

It is left-out this way since most of the checks only care about a single file, and there are lots of them so changing would definitely require some plumbing.

Agree that no matter what we do, we should let the current usages stay the same.

I am not sure having the inconsistency between main file and others matters so much though; we have similar discrimination towards non-main files in a bunch of other places in testing infrastructure e.g TestTU.

What kind of inconsistency are you referring to? Both main file and extra files are fields inside TestTU? It outputs ParsedAST and it's a return value of the build() function.
There are no multiple ways to return results.

As for the suggestions, adding another apply function for multifile case doesn't seem so nice, so I would rather move forward with first one, if you believe this is a strong concern.

Yeah, totally ok with this. I'm not a big fan of out parameters myself and prefer putting stuff into the return value instead.
But they're still widely used, so I view this as my preference rather than a common convention (sigh...)

kadircet updated this revision to Diff 222180.Sep 27 2019, 8:33 AM
  • Add renaming of template and function parameters

We also need to rename parameters sometimes, right?

// Sometimes we need to rename parameters.
void usages(int decl_param, int);

void usages(int def_param, int now_named) {
  llvm::errs() << def_param + now_named;
}

// And template parameters! (these are even more interesting)
template <class T>
struct Foo {
  template <class U, class>
  void usages();
};
template <class L>
template <class R, class NowNamed>
void Foo<L>::usages() {
  llvm::errs() << L() + R() + NowNamed();
}

So currently AST doesn't store any information regarding template parameter locations except the deepest one.
Therefore I've changed the availability to discard any methods inside templated classes, since there is no way to
validate the template parameter names. Hopefully this should be a rare use-case, and I believe most of the times
people have same template paramater names on declaration and definition. Let me know what you think about
it.

In addition to that implemented renaming for function parameters and template parameters in case of templated
functions.

So currently AST doesn't store any information regarding template parameter locations except the deepest one.
Therefore I've changed the availability to discard any methods inside templated classes, since there is no way to
validate the template parameter names. Hopefully this should be a rare use-case, and I believe most of the times
people have same template paramater names on declaration and definition. Let me know what you think about
it.

Thanks! LG, I also think this should be a rare case. The deepest template parameters are typically the only ones anyway.

In addition to that implemented renaming for function parameters and template parameters in case of templated
functions.

Thanks, will take a look!

kadircet updated this revision to Diff 222562.Oct 1 2019, 2:01 AM
  • Fix getSemiColon to handle semicolons at the end of file
kadircet updated this revision to Diff 222892.Oct 2 2019, 12:41 PM
kadircet marked 5 inline comments as done.
  • Update test helper to take an out parameter
ilya-biryukov added inline comments.Oct 4 2019, 1:40 AM
clang-tools-extra/clangd/refactor/tweaks/DefineInline.cpp
68

"Lexes" is probably a not very relevant implementation detail.
I'd say this function probably doesn't need a comment, it's clear what it does from its name.

168

NIT: Maybe store Error directly?
It would communicate the intent of the variable more clearly.

174

NIT: IIUC, this could be simply if (Ref.Qualifier)

181

As discussed before, Targets can actually come from different scopes, e.g. from ADL.

182

It just occurred to me that this is a really bad idea. I think the order of items in Targets is non-deterministic.
Could we instead try to check if all scopes are the same and only qualify in that case?

195

You would want to use ND->printNestedNameSpecifier() instead to avoid printing inline namespaces:

namespace std { inline namespace v1 { 
 struct vector {};
}}

^-- I believe the current code would print std::v1:: for vector and we want std::.
Could you add this example to tests too?

198

Maybe return a generic error in the end instead of the last error?
Something like createStringError(..., "Failed to compute qualifiers, see errors in the logs for more details").

Should be a better UX for anyone who tries to explore what went wrong.

222

NIT: mention that both function parameters and template parameters are updated here.

225

This does not rename any references to those parameters, right?
E.g.

template <class T> void foo(T x);

template <class U> void foo(U x) {}

would turn into

template <class U> void foo(T x);

right?

kadircet updated this revision to Diff 223620.Oct 7 2019, 9:31 AM
kadircet marked 9 inline comments as done.
  • Address comments
kadircet added inline comments.Oct 7 2019, 9:38 AM
clang-tools-extra/clangd/refactor/tweaks/DefineInline.cpp
195

it is the opposite, printNamespaceScope skips anonymous and inline namespaces, whereas printNestedNameSpecifier also prints those. adding a test.

225

yes that's right, just adding a unittest will address this later on

kadircet updated this revision to Diff 224824.Oct 14 2019, 2:58 AM
  • Move parameter renaming logic to a separate patch.
kadircet added inline comments.Oct 14 2019, 3:04 AM
clang-tools-extra/clangd/refactor/tweaks/DefineInline.cpp
225

moving into D68937, since it got bigger than expected.

Mostly comments about tests, the implementation itself LG.

clang-tools-extra/clangd/refactor/tweaks/DefineInline.cpp
68

NIT: s/SemiColon/Semicolon

79

What are the tokens we expect to see before the semicolon here?
It feels safer to just skip comments and whitespace and check the next token is a semicolon.

Any reason to have an actual loop here?

181

NIT: could we mention the tweak name here? to make sure it's easy to detect those lines in the logs:
define inline: targets from multiple contexts

208

NIT: braces are redundant

320–366

NIT: use auto

325

NIT: braces are redundant

clang-tools-extra/clangd/unittests/TweakTesting.cpp
85

Could you document EditedFiles contains all other edited files and the original std::string only contains results for the main file?
Would also be useful to document what is the key in the StringMap: absolute/relative paths or URI?

clang-tools-extra/clangd/unittests/TweakTests.cpp
1131

Why have a comment in this test?
If we need to test we handle comments before the semicolon, could we move to a separate focused test?
If we want to test something else, let's leave out the comment. The test is hard enough to read on its own

1148

Template declarations inside function bodies are not allowed in C++.
Are we getting compiler errors here and not actually testing anything?

1187

Same here: we don't need the comment here.

1190

Either remove using namespace a from the function body or put a FIXME this shouldn't actually qualify?
I believe the first option is what we want, if we're trying to check template arguments are getting transformed properly

1215

Maybe move this matcher to the start of the file?

1255

We prefer to have using ::testing::ElementsAre at the start of the file and not qualify the usage everywhere in our tests.

1302

This is really uncommon, I'm not even sure we should test this.
The other case is much more common and interesting:

template <class T>
void foo(T p);

template <>
void foo<int>(int p) { /* do something */ }

Could we add a test that we don't suggest this action in that case?

1393

Can we test with a single namespace?
Having 3 layers of nested namespace in every test makes reading them very complicated.

Instead, we could validate with separate tests:

  • that the qualification logic works with multiple nested namespaces,
  • that various kinds of references are properly qualified with a single namespace

I believe that would keep both tests more focused

1441

Could you indent here and in other tests? it's hard to follow the fact that Foo is inside b without indentation.
Same for other tests.

Alternatively, put on one line if there's just a single declaration inside the namespace

kadircet updated this revision to Diff 226209.Oct 24 2019, 1:34 AM
kadircet marked 20 inline comments as done.
  • Address comments
clang-tools-extra/clangd/refactor/tweaks/DefineInline.cpp
79

it has been a long time since our offline discussions around this one. but the idea was could have any attributes before semicolon, these could be macros that will expand to one thing in some platforms and another in a different platform. Therefore we've decided to skip anything until we've found a semicolon.

I don't think I follow the second part of the question though? What do you mean by "having an actual loop"?

208

I've thought you were a fan of braces when the inner statement spans multiple lines, even if it is a single statement :D

clang-tools-extra/clangd/unittests/TweakTests.cpp
1148

well, it was still working and inlining the function definition, apparently despite the diagnostic, ranges were correct.

1302

yes there is a test for this in availability checks patch:

EXPECT_UNAVAILABLE(R"cpp(
    template <typename T> void foo();

    template<> void f^oo<int>() {
    })cpp");
1441

no more nested namespaces

ilya-biryukov added inline comments.Oct 24 2019, 1:44 AM
clang-tools-extra/clangd/refactor/tweaks/DefineInline.cpp
79

The loop I mentioned is

while (!CurToken.is(tok::semi))

I believe this could be replaced with

auto NextTok = Lexer::findNextToken(...);
if (NextTok && NextTok->is(tok::semi))
  ...

There are two common attribute applications we should mostly care about:

// Attribute in declaration specifiers.
[[noreturn]] int foo(); 

// Attribute after declarator name
int foo [[noreturn]]();

Both work with a simple if statement, rather than the attributes.

The case we're covering with the loop is super uncommon and I don't think we should even bother to support it:

int foo() [[some::attribute]];
ilya-biryukov marked an inline comment as done.Oct 24 2019, 1:49 AM
ilya-biryukov added inline comments.
clang-tools-extra/clangd/refactor/tweaks/DefineInline.cpp
208

It's more complicated than that... Let me explain...
I'm a fan of braces when:

  1. There are line comments (I perceive those as an extra statement):
if (something) {
  // Here comes a comment...
  return 10;
}
  1. There are composite statements inside the if body:
if (something) {
  for (;;)
    do_something_else();
}

In your case, I'd not use braces...
It's not important, though, pick the style you like most.

clang-tools-extra/clangd/unittests/TweakTests.cpp
1148

I guess we didn't need to do any edits there, so we just carried the text unchanged.
Had we actually needed to change something inside the template, we'd fail to do so.

Range of the function body is correct, since the parser can easily recover in those cases.

1441

Yay! Thanks!

ilya-biryukov marked an inline comment as done.Oct 24 2019, 2:30 AM

A few last NITs and one important comment about handling the case when function definition come from macro expansions

clang-tools-extra/clangd/refactor/tweaks/DefineInline.cpp
323

NIT: s/SemiColon/Semicolon

334

Could we add a test when the semicolon comes from a macro expansion? I believe the action won't be available in that case, which is ok. Just to make sure we cover this corner-case.

#define SEMI ;
void foo() SEMI

Also interesting cases:

  • source function is in a macro expansion
  • target function is in a macro expansion.

I believe they might catch a few bugs here, as we seem to assume all locations are file locations...

clang-tools-extra/clangd/unittests/TweakTesting.h
73

Hm, maybe even call ADD_FAILURE() when there are changes to other files and tests pass null to EditedFiles?
Likely an error in the test.

clang-tools-extra/clangd/unittests/TweakTests.cpp
1226

Argh... Without clang-format that looks SO weird :-)

1266

Maybe put it before using namespace a;?
To make sure the test does not need to change if we actually support not qualifying when not neccessary

1332

How is this test different from the previous one?
Is it just trying to make sure we don't always move to the first function body?

1374

Do we test the following case anywhere?

template <class T> void foo();
template <> void ^foo<int>() { // should not be available
  return;
}
kadircet updated this revision to Diff 226230.Oct 24 2019, 5:28 AM
kadircet marked 15 inline comments as done.
  • Improve macro handling
kadircet added inline comments.Oct 25 2019, 12:02 AM
clang-tools-extra/clangd/refactor/tweaks/DefineInline.cpp
334

see macro tests for the behavior in such cases.

clang-tools-extra/clangd/unittests/TweakTests.cpp
1332

yes that's the point, adding a comment.

1374

Yes, there is once check in availability tests for:

EXPECT_UNAVAILABLE(R"cpp(
  template <typename T> void foo();

  template<> void f^oo<int>() {
  })cpp");
ilya-biryukov accepted this revision.Oct 25 2019, 12:28 AM
ilya-biryukov marked 2 inline comments as done.

LGTM

clang-tools-extra/clangd/refactor/tweaks/DefineInline.cpp
323

This one is still there.

334

s/SemiColon/Semicolon

clang-tools-extra/clangd/unittests/TweakTests.cpp
1332

Thanks!

1374

Ah, nice, I missed it. Thanks for pointing it out.

This revision is now accepted and ready to land.Oct 25 2019, 12:28 AM
kadircet updated this revision to Diff 226388.Oct 25 2019, 2:46 AM
kadircet marked 2 inline comments as done.
  • s/SemiColon/Semicolon/
This revision was automatically updated to reflect the committed changes.