This is an archive of the discontinued LLVM Phabricator instance.

[ClangRepl] support code completion at a REPL
ClosedPublic

Authored by capfredf on Jul 3 2023, 11:07 AM.

Details

Summary

This patch enabled code completion for ClangREPL. The feature was built upon three
existing Clang components: a list completer for LineEditor,
a CompletionConsumer from SemaCodeCompletion, and the ASTUnit::codeComplete method.
with the first component serving as the main entry point of handling interactive inputs.

Because a completion point for a compiler instance has to be unchanged once it
is set, an incremental compiler instance is created for each code
completion. Such a compiler instance carries over AST context source from the
main interpreter compiler in order to obtain declarations or bindings from
previous input in the same REPL session.

The most important API codeComplete in Interpreter/CodeCompletion is a thin
wrapper that calls with ASTUnit::codeComplete with necessary arguments, such as
a code completion point and a ReplCompletionConsumer, which communicates
completion results from SemaCodeCompletion back to the list completer for the
REPL.

In addition, PCC_TopLevelOrExpression and CCC_TopLevelOrExpression` top levels
were added so that SemaCodeCompletion can treat top level statements like
expression statements at the REPL. For example,

clang-repl> int foo = 42;
clang-repl> f<tab>

From a parser's persective, the cursor is at a top level. If we used code
completion without any changes, PCC_Namespace would be supplied to
Sema::CodeCompleteOrdinaryName, and thus the completion results would not
include foo.

Currently, the way we use PCC_TopLevelOrExpression and
CCC_TopLevelOrExpression is no different from the way we use PCC_Statement
and CCC_Statement respectively.

Diff Detail

Event Timeline

capfredf created this revision.Jul 3 2023, 11:07 AM
Herald added a project: Restricted Project. · View Herald TranscriptJul 3 2023, 11:07 AM
capfredf requested review of this revision.Jul 3 2023, 11:07 AM
Herald added a project: Restricted Project. · View Herald TranscriptJul 3 2023, 11:07 AM
Herald added a subscriber: cfe-commits. · View Herald Transcript

Can you extend the commit log to include a description of how things are currently done? For example, it would be good to read about design and technical decisions, etc.

clang/lib/Interpreter/CodeCompletion.cpp
79 ↗(On Diff #536848)

Why we drop the error on the floor?

84 ↗(On Diff #536848)

I am not sure why we need to wrap this code. Can we teach the TopLevelStmtDecl to work with the code completion logic?

capfredf added inline comments.Jul 6 2023, 7:38 AM
clang/lib/Interpreter/CodeCompletion.cpp
84 ↗(On Diff #536848)

I think the problem has less to do with the TopLevelStmtDecl. Usually, a TopLevelStmtDecl instance is a successful result from ParseXXXXX, but when CodeCompletion is triggered during parsing, nothing meaning comes out from ParseXXXX.

We need to wrap our input because we need Sema::CodeCompleteOrdinaryName to work on code from the REPL in a different completion context than the same code from a regular source file.

In a regular C++ file,

int foo = 42;
f_

Since top-level expressions are not supported, foo should not be an option.

But in a REPL session,

clang-repl> int foo = 42;
clang-repl> f_

we are allowed to use foo to make a statement like foo + 84;.

Since the argument for CompletionContext to CodeCompleteOrdinaryName is controlled by the parsing process, ie. there is no way to configure it from outside, I think faking context is a straightforward solution.

Alternatively, we can check if the compiler is in incremental mode and call CodeCompleteOrdinaryName with the context we want in parsing. Something like:

if (PP.isIncrementalProcessingEnabled() {
  Actions.CodeCompleteOrdinaryName(getCurScope(), Sema::PCC_Statement);
} else {
  Actions.CodeCompleteOrdinaryName(getCurScope(), Sema::PCC_Namespace);
}

I personally think both solutions are extensionally equivalent, but the alternative is a bit intrusive.

capfredf updated this revision to Diff 537766.Jul 6 2023, 9:21 AM
capfredf edited the summary of this revision. (Show Details)

changes per discussions

capfredf marked an inline comment as done.Jul 6 2023, 9:22 AM
capfredf edited the summary of this revision. (Show Details)
capfredf updated this revision to Diff 537883.Jul 6 2023, 2:26 PM
capfredf edited the summary of this revision. (Show Details)

update

capfredf updated this revision to Diff 537910.Jul 6 2023, 3:40 PM

move the comment up

capfredf updated this revision to Diff 537912.Jul 6 2023, 3:42 PM

move the comment up

capfredf updated this revision to Diff 537915.Jul 6 2023, 3:49 PM

remove an unused decl

capfredf marked an inline comment as not done.Jul 6 2023, 4:01 PM
capfredf added inline comments.
clang/lib/Interpreter/CodeCompletion.cpp
84 ↗(On Diff #536848)

I think the combination of checking isIncrementalProcessingEnabled and adding a new PCC might be better since it is really a different context at the REPL

capfredf updated this revision to Diff 537933.Jul 6 2023, 5:00 PM
capfredf marked an inline comment as not done.

get rid of the wrapping

capfredf edited the summary of this revision. (Show Details)Jul 6 2023, 5:00 PM
capfredf marked an inline comment as done.
v.g.vassilev added a subscriber: aaron.ballman.

@capfredf this seems to be heading in a good direction. Please find another round of comments.

@aaron.ballman, I could not find a person that's active in that area. Could you help with finding reviewers for this type of patch?

clang/include/clang/Interpreter/Interpreter.h
117 ↗(On Diff #537933)

I could not find where this interface was used. If it is unused, let'd drop it.

clang/include/clang/Sema/Sema.h
13461
clang/lib/Interpreter/IncrementalParser.cpp
318

Seems that we have stray debug fragments? Can you remove them?

330

Why we had to comment out this line?

clang/lib/Interpreter/Interpreter.cpp
245

IIUC, setCodeCompletionConsumer takes the ownership of CConsumer. I'd suggest to drop CConsumer member.

capfredf marked 5 inline comments as done.Jul 12 2023, 7:57 AM
capfredf added inline comments.
clang/include/clang/Interpreter/Interpreter.h
117 ↗(On Diff #537933)

it is used at L86 of CodeCompletion.cpp

clang/lib/Interpreter/IncrementalParser.cpp
330

This is old code. FID is created in a different way. See below.
I will remove the commented code

capfredf updated this revision to Diff 539572.Jul 12 2023, 8:11 AM
capfredf marked 2 inline comments as done.
capfredf edited the summary of this revision. (Show Details)
  • rename
  • Do not export ReplCompletionConsumer
  • remove commented code
capfredf updated this revision to Diff 539578.Jul 12 2023, 8:35 AM
capfredf edited the summary of this revision. (Show Details)

@v.g.vassilev I moved the ReplCodeCompletionConsumer class out of the header file, since it is not needed elsewhere.

v.g.vassilev added inline comments.Jul 12 2023, 10:56 AM
clang/include/clang/Sema/CodeCompleteConsumer.h
342
clang/lib/Interpreter/CodeCompletion.cpp
102 ↗(On Diff #539578)

Let's move this in Interpreter::createForCodeCompletion.

118 ↗(On Diff #539578)

Please remove the braces around single statement blocks.

clang/lib/Interpreter/Interpreter.cpp
291

I still do not entirely understand why we can "just" ask the codecompletion infrastructure for the possible options at the current position. I know that's probably easier set than done but I'd like to entertain that idea for a while..

capfredf added inline comments.Jul 13 2023, 7:37 AM
clang/lib/Interpreter/Interpreter.cpp
291

I'm a bit confused. Can you expand on the suggestion?

capfredf added inline comments.Jul 13 2023, 7:46 AM
clang/lib/Interpreter/CodeCompletion.cpp
102 ↗(On Diff #539578)

The reasons I moved this out of Interpreter::createForCodeCompletion are

  1. Design-wise, we don't need to expose ReplCompletionConsumer to Interpreter.cpp. All createForCodeCompletion needs is a subclass instance of CodeCompletionConsumer.
  2. ReplCompletionConsumer has more data members on the type-directed branch than the one on this branch. I'd rather create an instance here than pass all the necessary arguments to one layer down and do that there.
capfredf updated this revision to Diff 541753.Jul 18 2023, 2:59 PM

Remove unnecessary braces

capfredf marked 2 inline comments as done.Jul 19 2023, 5:43 AM
capfredf updated this revision to Diff 542622.Jul 20 2023, 11:49 AM

handle cases where code completion in a function declaration

capfredf marked an inline comment as done.Jul 20 2023, 11:50 AM
capfredf added inline comments.Jul 20 2023, 11:56 AM
clang/unittests/Interpreter/CodeCompletionTest.cpp
71 ↗(On Diff #542622)

@v.g.vassilev The way I fixed this is a bit hacky. Do you have a better idea?

v.g.vassilev added inline comments.Jul 20 2023, 1:19 PM
clang/unittests/Interpreter/CodeCompletionTest.cpp
71 ↗(On Diff #542622)

Now I think I understand what you were after. Instead of handling the error in the constructor we should handle it in the caller. Here is some information how we could do that: https://llvm.org/docs/ProgrammersManual.html#fallible-constructors

capfredf updated this revision to Diff 543000.Jul 21 2023, 11:14 AM

changes per @v.g.vassilev's comments

capfredf marked an inline comment as done.Jul 21 2023, 11:14 AM

I think we are getting there. I added a few suggestions how to improve the current design.

clang/include/clang/Interpreter/CodeCompletion.h
15 ↗(On Diff #543000)

I still am a bit uncomfortable with relying on a line editor api to address a generic question one could as the interpreter such as "What's visible at that code position"?

Can we not make the overloaded operators outside of the ReplListCompleter class. That means moving them in ClangRepl.cpp?

clang/include/clang/Interpreter/Interpreter.h
42 ↗(On Diff #543000)

We should put that in its alphabetical order.

clang/lib/Interpreter/CodeCompletion.cpp
111 ↗(On Diff #543000)

I do not believe we need a spe

clang/lib/Interpreter/IncrementalParser.cpp
305

Is there any other particular reason for this routine to be different rather than PP.SetCodeCompletionPoint(*FE, Line, Col)? That is, can we merge back the old code and only set the point of completion externally?

v.g.vassilev added inline comments.Jul 31 2023, 10:37 AM
clang/lib/Interpreter/Interpreter.cpp
291

I do not believe we need a special case for creating a CodeCompletionConsumer. I think we could rely on the logic in CompilerInstance::createCodeCompletionConsumer which can be triggered in IncrementalAction::ExecuteAction just like in ASTFrontendAction::ExecuteAction. We will need to copy some of the code over but then you could start the same interpreter infrastructure with slightly different flags.

clang/lib/Parse/ParseDecl.cpp
6741

Can we test this change separately outside of clang-repl? That would mean creating a tests and run it with clang -Xclang -fincremental-extensions

clang/lib/Parse/Parser.cpp
934

Likewise, it would be great if we have a test in clang-only mode.

capfredf marked 5 inline comments as done.Aug 1 2023, 8:20 AM
capfredf added inline comments.
clang/lib/Interpreter/CodeCompletion.cpp
111 ↗(On Diff #543000)

I am not sure I am following.
For a regular interpreter, create Interpreter -> create Incremental Parser -> IncrmentalASTAction triggered -> set a default code completion consumer(PrintCodeCompletionConsumer) to CI if no code consumer has been set -> Sema initialized.
Based on this logic, we need to provide our CodeConsumer so that IncrementalASTAction does not create the default, plus we need to a parent CI to a new code completion interpreter. I don't see how we can avoid a special code path for creation

clang/lib/Interpreter/IncrementalParser.cpp
305

The regular Parse does some lexing after parsing, which I don't think is necessary for code completion.

Also, this is more or less because, as a functional programmer, I prefer building things using small functions with explicit arguments.

I'll try if I can use the old parse with the completion point set and tested.

capfredf updated this revision to Diff 547241.Aug 4 2023, 8:48 AM
capfredf marked 2 inline comments as done.

changes per discussions

capfredf marked 3 inline comments as done.Aug 4 2023, 8:48 AM
capfredf updated this revision to Diff 549658.Aug 12 2023, 2:52 PM

use ASTUnit::codeComplete

capfredf added inline comments.Aug 12 2023, 3:01 PM
clang/include/clang/Frontend/ASTUnit.h
906

@v.g.vassilev Not sure if this is the right solution or idiom to extend this method.

we can make this high order function more general, like std::unique_ptr<SyntaxOnlyAction*><()>

capfredf updated this revision to Diff 549739.Aug 13 2023, 12:16 PM

changes per discussions

v.g.vassilev added inline comments.Aug 13 2023, 12:22 PM
clang/lib/Interpreter/IncrementalParser.h
90

Can we not move this class definition locally to the CodeComplete interface?

v.g.vassilev added a subscriber: sammccall.

This looks mostly good to me. I noticed that @sammccall has done some work in that area in clangd and I was wondering if he has some thoughts about this patch.

capfredf abandoned this revision.Aug 13 2023, 1:29 PM
capfredf reclaimed this revision.
capfredf marked an inline comment as done.
v.g.vassilev added inline comments.Aug 13 2023, 1:50 PM
clang/lib/Interpreter/CodeCompletion.cpp
1 ↗(On Diff #549751)

I would propose to rename this file to InterpreterCodeCompletion.cpp and implement the Interpreter::codeComplete interface. Then we can move all of the content of ExternalSource.{h,cpp}, CodeCompletion.h and ExternalSource.h in it. That will increase the encapsulation of the code.

capfredf updated this revision to Diff 549759.Aug 13 2023, 3:07 PM

move everything into InterpreterCodeCompletion.

the entry point for code completion is now a function, codeCompletion

capfredf marked an inline comment as done.Aug 13 2023, 3:12 PM

This mostly looks good.

My main concern is adding code to the parser to suppress errors when code completing: this is unlikely to be the only such place. Existing consumers seem happy to ignore errors, if this doesn't work for clang-repl it'd be good to understand why.

clang/include/clang/Frontend/ASTUnit.h
891

This comment doesn't really say anything beyond echoing the name/type, either remove it or add some separate explanation?

e.g. "If Act is specified, it will be used to parse the file. This allows customizing the parse by overriding FrontendAction's lifecycle methods."

906

There's no need for this to be a SyntaxOnlyAction, you can take FrontendAction here.

I don't think there's any need to provide a factory, it's safe to assume it will always create one action. (We have FrontendActionFactory in tooling, and it's a pain.)

clang/include/clang/Interpreter/InterpreterCodeCompletion.h
30 ↗(On Diff #549760)

nit: lowercase initial letter for functions

30 ↗(On Diff #549760)

you might consider vector<string> instead.

this signature doesn't provide any way to return owned names, and so requires that the results are already allocated somewhere as contiguous strings.

This will prevents completing names like operator int, as well as some non-names that might prove useful.

This is an interactive operation, copying the strings is unlikely to add significant latency on human scale.

clang/include/clang/Sema/CodeCompleteConsumer.h
342

I don't think this name fits with the others, it describes the client rather than the grammatical/semantic context.

I would suggest maybe CCC_TopLevelOrExpression?

clang/lib/Interpreter/InterpreterCodeCompletion.cpp
72 ↗(On Diff #549760)

(up to you, but default between cases seems surprising)

clang/lib/Parse/ParseDecl.cpp
6733

AIUI, all code completion actions should be ignoring errors, and we should always be happy to cut off parsing and give up as soon as the completion token is reached.

So why do we need special handling for incremental processing here?

(i.e. what behavior do non-repl users see here - if it's correct can clang-repl do the same, if it's wrong why don't we fix both?)

capfredf updated this revision to Diff 550489.Aug 15 2023, 2:55 PM
capfredf marked 7 inline comments as done.

address @sammccall 's comments

clang/include/clang/Frontend/ASTUnit.h
906

Ideally, this method requires a FrontendAction with hasCodeCompletionSupport returning true. The reason I chose SyntaxOnlyAction are 1. the old code used it 2. SyntaxOnlyAction::hasCodeCompletionSupport returns true. Certainly, right now SyntaxOnlyAction does not prevent its subclasses from overriding this method.

clang/include/clang/Sema/CodeCompleteConsumer.h
342

Nice. This one sounds better indeed.

@sammccall Thank you very much for your valuable input.

capfredf updated this revision to Diff 551218.Aug 17 2023, 11:57 AM

@sammccall could you take another look? It seems quite ready to me to land.

sammccall accepted this revision.Aug 18 2023, 3:46 AM

LGTM, thanks!

clang/include/clang/Sema/CodeCompleteConsumer.h
341

I think this is non-obvious & worthy of more explanation

/// Completion in a namespace or global scope, but also accept expressions.
/// This occurs in REPL inputs, which may declare things  or print expression
/// values.
This revision is now accepted and ready to land.Aug 18 2023, 3:46 AM
v.g.vassilev accepted this revision.Aug 19 2023, 12:59 AM

Thanks for working on this @capfredf. Let me know if I should land that for you.

capfredf updated this revision to Diff 552418.Aug 22 2023, 10:06 AM
capfredf edited the summary of this revision. (Show Details)

update the commit message

This revision was automatically updated to reflect the committed changes.
capfredf reopened this revision.Aug 23 2023, 8:38 AM
This revision is now accepted and ready to land.Aug 23 2023, 8:38 AM
capfredf updated this revision to Diff 552736.Aug 23 2023, 8:40 AM

fix code in Clangd

Herald added a project: Restricted Project. · View Herald TranscriptAug 23 2023, 8:40 AM
Herald added a subscriber: kadircet. · View Herald Transcript
v.g.vassilev added inline comments.Aug 24 2023, 11:39 PM
clang/unittests/Interpreter/CodeCompletionTest.cpp
38 ↗(On Diff #552736)

We should make this an out parameter instead of returning it by copy.

42 ↗(On Diff #552736)
sammccall added inline comments.Aug 25 2023, 1:57 AM
clang-tools-extra/clangd/CodeComplete.cpp
846

This should be true rather than false, since both TopLevel and Expression are completed using the index.

(Though I realize that for now we'll never hit this context)

capfredf updated this revision to Diff 553759.Aug 26 2023, 1:45 PM

fix potential memory issues

capfredf updated this revision to Diff 553762.Aug 26 2023, 1:52 PM
capfredf marked 4 inline comments as done.
capfredf updated this revision to Diff 554002.Aug 28 2023, 11:22 AM

fix memory issues

This revision was landed with ongoing or failed builds.Aug 28 2023, 1:00 PM
This revision was automatically updated to reflect the committed changes.