This is an archive of the discontinued LLVM Phabricator instance.

[clangd][CodeComplete] Improve FunctionCanBeCall
ClosedPublic

Authored by zyounan on Jul 29 2023, 9:03 PM.

Details

Summary

From two aspects:

  • For function templates, emit additional template argument

placeholders in the context where it can't be a call in order
to specify an instantiation explicitly.

  • Consider expressions with base type specifier such as

'Derived().Base::foo^' a function call.

Diff Detail

Event Timeline

zyounan created this revision.Jul 29 2023, 9:03 PM
Herald added a project: Restricted Project. · View Herald TranscriptJul 29 2023, 9:03 PM
zyounan published this revision for review.Jul 29 2023, 9:11 PM
Herald added projects: Restricted Project, Restricted Project. · View Herald TranscriptJul 29 2023, 9:11 PM
Herald added a subscriber: cfe-commits. · View Herald Transcript
zyounan updated this revision to Diff 547566.Aug 6 2023, 5:18 AM

Merge changes with function templates from D155370

zyounan retitled this revision from [CodeComplete] Mark 'Derived().Base::foo()' CanBeCall to [clangd][CodeComplete] Improve FunctionCanBeCall.Aug 6 2023, 5:20 AM
zyounan edited the summary of this revision. (Show Details)
zyounan edited the summary of this revision. (Show Details)Aug 6 2023, 5:20 AM
zyounan updated this revision to Diff 547569.Aug 6 2023, 5:24 AM

Trigger the build

Thanks for the patch, these are nice improvements

clang-tools-extra/clangd/CodeCompletionStrings.cpp
258 ↗(On Diff #547569)

It looks like we are assuming two things:

  1. Any LeftParen in an RK_Declaration starts a function argument list
  2. Everything that comes after the function argument list can be discarded

I think these assumptions are fine to make, but let's be explicit about them in a comment

clang-tools-extra/clangd/unittests/CodeCompleteTests.cpp
536

Suggestion: it would be slightly more interesting to make the function signature generic(U). Then, the function can be called as generic<T>(u) (with the template parameter U deduced), and the LastDeducibleArgument logic should make sure that only <T> is added to the code completion string, not <T, U>.

580

Since you're updating this test anyways, could you add signature() matchers for the non-template cases as well please?

clang-tools-extra/clangd/unittests/CodeCompletionStringsTests.cpp
178 ↗(On Diff #547569)

Maybe add:

// Arguments dropped from snippet, kept in signature

so someone reading the test knows it's deliberate

clang/lib/Sema/SemaCodeComplete.cpp
345

The word "base" is a bit overloaded in this signature; in the parameter InBaseClass it refers to inheritance, but in BaseType it refers to the base expression of a MemberExpr.

Maybe we can name the new parameter BaseExprType to avoid confusion?

1285

Should we propagate BaseType into this recursive call?

1425

Is there any situation where MaybeDerived == MaybeBase || MaybeDerived->isDerivedFrom(MaybeBase) is false?

I am wondering if we can simplify this to:

if (!BaseType.isNull()) {
  R.FunctionCanBeCall = true;
}
3590
  1. If this is not a call, we can avoid running the Sema::MarkDeducedTemplateParameters operation altogether.
  1. A future improvement we could consider: if this is not a call, try to detect cases where the template parameters can be deduced from the surrounding context ("Deducing template arguments taking the address of a function template "). Maybe add a FIXME for this?

Thanks, this does look better.

I agree with Nathan's comments and will leave final stamp to him.

clang-tools-extra/clangd/CodeCompletionStrings.cpp
144 ↗(On Diff #547569)

I find this comment very unclear, I think because it starts with implementation details and works its way up to general principles.

I think Buffer that snippet chunks are written to once we've decided to discard the snippet due to DropFunctionArguments or so would be enough.

However I'd rather drop this variable entirely if possible, the data flow is confusing. See below.

148 ↗(On Diff #547569)

For the reason you give here, it's actually *not* preferable. So I'd suggest leaving out this comment.

258 ↗(On Diff #547569)

Agree, but also I think the code could reflect this more directly:

  • this should trigger only for CK_LeftParen, not CK_RightParen

Rather than redirecting output to a different string by reassigning the param, I think it would be a bit more direct to have

optional<unsigned> TruncateSnippetAt;
...
case CK_LeftBracket:
  if (DropFunctionArguments && ... && !TruncateSnippetAt)
    TruncateSnippetAt = Snippet->size();
...
if (TruncateSnippetAt)
  Snippet->resize(*TruncateSnippetAt);
}

(though still not totally clear)

clang-tools-extra/clangd/CodeCompletionStrings.h
45 ↗(On Diff #547569)

Prefer positive sense for bool params (IncludeFunctionArguments) to avoid double-negative confusion

clang/lib/Sema/SemaCodeComplete.cpp
1387

This is confusing: the CCC_Symbol test is part of the specific heuristics being used (it's the "not via dot/arrow member access" part, right?) but you've moved the comment explaining what it does.

Also this function is just getting too long, and we're inlining more complicated control flow here.
Can we extract a function?

const auto *Method = ...;
if (Method & !Method->isStatic()) {
  R.FunctionCanBeCall = canMethodBeCalled(...);
}
1417

This description is hard for me to follow, and it's hard to tell if this is independent of the previous check, or an exception to it.

An example would be clearer. (I think instead rather than in addition):
Exception: foo->FooBase::bar() *is* a call.

1420

if (const CXXRecordDecl *BaseDecl = BaseType ? BaseType-> getAsCXXRecordDecl() : nullptr)?
avoiding the reassignment inside a boolean expression

zyounan updated this revision to Diff 551723.Aug 19 2023, 2:17 AM
zyounan marked 14 inline comments as done.

Address many comments.

Thanks for Sam and Nathan's thorough review! I've updated, and please take another look.
(Apologize for churning on weekends, but I don't have chunk time during the week.)

clang-tools-extra/clangd/CodeCompletionStrings.cpp
258 ↗(On Diff #547569)

It looks like we are assuming two things

Honestly, I don't quite prefer this assumption. However, we have lost some of the semantics e.g., the structure of a call expression within this function, and unfortunately it's not trivial to pass these out from clang. :-(

(though still not totally clear)

Yeah. I imagine a clearer way would be separating the calculation for Signature and Snippet, and we could bail out early for Snippet. But I'm afraid that making so many adjustments to getSignature for such a small feature is inappropriate.

Assuming the left parenthesis is the beginning of a call might be sufficient for the time being, and let's leave the refactoring to subsequent patches.

clang-tools-extra/clangd/unittests/CodeCompleteTests.cpp
580

Of course!

clang/lib/Sema/SemaCodeComplete.cpp
345

Makes sense to me. Thanks for the correction.

1285

Yes. And thanks for spotting this. I added another test case reflecting it.

1387

it's the "not via dot/arrow member access" part

(Sorry for being unaware of the historical context). But I think CCC_Symbol should mean "we're completing a name such as a function or type name" per its comment. The heuristic for dot/arrow member access actually lies on the next line, i.e., if the completing decl is a CXXMethodDecl.

Can we extract a function?

Sure.

1425

I think this could prevent us from completing ill-formed expressions like:

struct B {
  void foo();
};

struct D : B {
  void bar();
};

struct S {
  void method();
};

void baz() {
  D().S::^
}

Currently, we have the candidate method, and completing it with parentheses is definitely wrong. (Although we shouldn't be providing method as well.)

3590

avoid running the Sema::MarkDeducedTemplateParameters operation altogether

I think doing so could cause too many adjustments to the flow, and I'm afraid that the Sema::MarkDeducedTemplateParameters would be required again when we decide to implement point 2.

I'm adding a FIXME anyway but leave the first intact. However, I'd be happy to rearrange the logic flow if you insist.

zyounan updated this revision to Diff 551730.Aug 19 2023, 2:59 AM

Fix the bool expression.

zyounan updated this revision to Diff 551736.Aug 19 2023, 3:58 AM

Sorry, forgot to update tests :(

Gently ping~

(I'm away on travels, will get back to this within the next week)

nridge requested changes to this revision.Sep 5 2023, 12:58 AM
nridge added inline comments.
clang/lib/Sema/SemaCodeComplete.cpp
1297

nit: we can avoid the computation in this block if FunctionCanBeCall was already initialized to true above

1387

The check for CompletionContext.getKind() is in fact a part of the heuristic:

  • for f.method, the kind is CCC_DotMemberAccess
  • for f->method, the kind is CCC_ArrowMemberAccess
  • for f.Foo::method and f->Foo::method, the kind is CCC_Symbol

(I realize that's a bit inconsistent. Maybe the f.Foo:: and f->Foo:: cases should be using DotMemberAccess and ArrowMemberAccess as well? Anyways, that's a pre-existing issue.)

So, the if (CompletionContext.getKind() == clang::CodeCompletionContext::CCC_Symbol) check is what currently makes sure that in the f.method and f->method cases, we just keep FunctionCanBeCall = true without having to check any context or expression type.

I think it may be clearest to move this entire if block into the new function (whose name can be generalized to canBeCall or similar), and here just unconditionally set R.FunctionCanBeCall = canBeCall(CompletionContext, /* other things */).

3590

I realized there's actually an open question here: if FunctionCanBeCall == false, do we want to include all the template parameters, or just the non-deducible ones?

Let's consider an example:

class Foo {
  template <typename T, typename U>
  T convertTo(U from);
};

void bar() {
  Foo::^
}

Here, U can be deduced but T cannot.

The current behaviour is to complete convertTo<T>. That seems appropriate for a call, since U will be deduced from the call. But since this is not a call, wouldn't it be better to complete convertTo<T, U>?

This revision now requires changes to proceed.Sep 5 2023, 12:58 AM
zyounan added inline comments.Sep 9 2023, 11:10 PM
clang/lib/Sema/SemaCodeComplete.cpp
3590

But since this is not a call, wouldn't it be better to complete convertTo<T, U>?

(Ugh, this is exactly this patch's initial intention, but how...?)

Oh, I made a mistake here: I presumed LastDeducibleArgument would always be 0 if we're in a non-callable context, but this is wrong! This variable is calculated based on the function parameters, which ought to be equal to the number of template parameters deducible in function parameters.

( I used to duplicate this if block for the !FunctionCanBeCall case but finally, I merged these two erroneously. :-( )

zyounan marked an inline comment as not done.Sep 9 2023, 11:11 PM
zyounan updated this revision to Diff 556909.Sep 17 2023, 2:37 AM
zyounan marked 2 inline comments as done.

Address comments

Thank you Nathan for your constructive opinions! I've updated this patch again, hopefully this becomes better.

clang/lib/Sema/SemaCodeComplete.cpp
1387

I think it may be clearest to move this entire if block into the new function

Nice suggestion. I'll take it.

3590

A future improvement we could consider: if this is not a call, try to detect cases where the template parameters can be deduced from the surrounding context ("Deducing template arguments taking the address of a function template "). Maybe add a FIXME for this?

UPD: I've looked into the deduction implementation and found it not difficult to add such a heuristic: We have PreferredType that describes the type in a declaration, e.g., void (*)(int, int) and DeduceTemplateArgumentsByTypeMatch in Sema to tell if the template parameter type P could be deduced from the 'synthesized' type A (That is, the declaring type void (int, int), for example).

I suppose we could apply such a deduction for each candidate and eliminate its template parameters in the result if the deduction succeeds - But let me leave that for the next patch(es) so I don't overload you. :-)

nridge accepted this revision.Sep 25 2023, 12:43 AM

Thank you, the updates look good! Please go ahead and merge after addressing the last minor nits.

clang-tools-extra/clangd/unittests/CodeCompleteTests.cpp
588

It seems a bit inconsistent that the signature includes parameters with default arguments in the non-call case, and not in the call case. I guess the reason for this is that in the non-call case, the parameters with defaults become a CK_Optional chunk which clangd adds to the signature here, while in the call case the deduced parameters (which include the ones with defaults) are omitted from the completion string altogether.

It may be more consistent to put the deduced parameters into an optional chunk as well, and let the consumer of the CodeCompletionString decide whether to use them?

Anyways, that's an idea for the future, no change needed now.

clang/lib/Sema/SemaCodeComplete.cpp
1283

nit: let's shorten this comment to just "If we're not inside the scope of the method's class, it can't be a call"

(The parts about non-static and dot/arrow member access are checked and explained in canFunctionBeCalled())

3550

nit: "We'd" --> "We should"

This revision is now accepted and ready to land.Sep 25 2023, 12:43 AM
zyounan marked 3 inline comments as done.Sep 28 2023, 4:16 AM

Thank you guys again for the long and meticulous review! :)

For a quick note, here are points I've excerpted from Nathan's reviews that need improvements later:

  1. Implement a heuristic that works for function template pointers whose template parameters could be deduced from the declaring type. https://reviews.llvm.org/D156605#inline-1546819
  1. Currently, the deducible template parameters are discarded by default; it may be more consistent to put the deduced parameters into an optional chunk as well, and let the consumer of the CodeCompletionString decide whether to use them. https://reviews.llvm.org/D156605#inline-1548400
clang-tools-extra/clangd/unittests/CodeCompleteTests.cpp
588

It seems a bit inconsistent that the signature includes parameters with default arguments in the non-call case, and not in the call case.

Indeed. If we just want consistent behavior (i.e., both get omitted) for default parameters, it is easy to fix at the moment. Note that

  1. the loop for default template params will run iff the bit vector Deduced is not empty.
  1. the vector would be resized in Sema::MarkDeducedTemplateParameters to fit the template params, which has been avoided in my patch to emit all template params for non-call cases. As a result, LastDeducibleArgument would always be 0 for non-calls, even with default template params.

We can change the Deduced to a default initialized N-size bit vector, where N is the size of template params. This way, the default params can be omitted in the loop as desired (by reducing LastDeducibleArgument to limit the range it would dump to the CCS chunk later), while non-default params get preserved.

It may be more consistent to put the deduced parameters into an optional chunk as well, and let the consumer of the CodeCompletionString decide whether to use them?

Sounds reasonable.

zyounan updated this revision to Diff 557441.Sep 28 2023, 4:22 AM

Address nits; discard default template parameters

zyounan updated this revision to Diff 557442.Sep 28 2023, 4:23 AM

Sorry, wrong patch

zyounan updated this revision to Diff 557443.Sep 28 2023, 5:44 AM

Fix the CI

This revision was automatically updated to reflect the committed changes.