This is an archive of the discontinued LLVM Phabricator instance.

[CodeComplete] an option to suppress endings for header completion
Needs ReviewPublic

Authored by richard9404 on Jul 9 2019, 1:13 AM.

Details

Summary

Header completion in clang always apppends endings to the results, i.e. / for folders and " (or >) for files. But a language-neutral client (like many vim lsp plugins) does not know a subsequent completion is expected, while the auto-inserted / prevents the user from triggering one by manually typing a /. And "/> might collide with auto pair insertion in some clients.

#include "f"
          ^ completion triggered here
       => folder/  <- client is not aware that another completion is expected
       => file.h"  <- '"' collides with the ending '"' inserted by the client

This commit adds an option to suppress the endings, and adds / as a completion trigger in clangd, which makes header completion more similar to object member completion. The downside is that clangd will not be able to distinguish folders from files when setting the completion item kinds.

Diff Detail

Event Timeline

richard9404 created this revision.Jul 9 2019, 1:13 AM
kadircet added inline comments.Jul 9 2019, 1:44 AM
clang-tools-extra/clangd/ClangdLSPServer.cpp
1088

i think instead of regex, it would be better to just perform a linear search to make sure line looks something like this(up to the Offset):

{whitespaces}#{whitespaces}include{whitespaces}{",<}{not {",>}}

IIRC, regex had a bad performance and we would like to trigger as quickly as possible. @sammccall might've more context about this one.

1090

i suppose this if should only enclose the StartOfLine = 0; statement, not the rest?

1092

as i mentioned above, the line doesn't have to start with #, and we cannot expect arbitrary stuff after include, we must make sure current point is inside quotes(") or brackets(<)

Thanks for digging through this!

Adding an option doesn't avoid a decision about what the behavior will be. >99% of users will use the default for something as fine-grained as this.
So we need to find a default that works well for most people. And if it's good enough to be the default, it's probably good enough to be the only behavior - supporting multiple has high costs and benefits few people.

As for behavior, it seems there's a few different changes/aspects worth looking at:

Inserting trailing quotes after filenames

The advantages:

  • don't have to type the quote, or remember whether it's > or "
  • makes the file/directory distinction in the completion list obvious

The disadvantages:

  • looks funny in completion list

(I haven't included issues if the line already has more text and a closing quote here, as we have to consider this case carefully either way)

My take here is that inserting the trailing quote is somewhat useful and people quickly get used to it. What do you think?

Inserting slashes after directories

The current workflow (with the slash inserted) is:

  1. start typing directory name
  2. select completion item
  3. hit enter or similar
  4. start typing filename, or hit ctrl-space etc

The proposed workflow is:

  1. start typing directory name
  2. select completion
  3. hit slash

This is certainly a better workflow, especially you don't know the first character of the next path segment (well, at least for US keyboard layout users where slash is easy to reach).
My main concerns are a) may not be available in all editors, b) users may not know about it and use it.

  1. start typing directory name
  2. select completion
  3. hit enter or similar
  4. hit slash

which is pretty clumsy. One option would be to check whether the client has commitCharacter support. If so, omit the slash, and send slash as a commitCharacter. Otherwise include the slash... This obviously solves the editor but not the user problem.

Handling the rest of the line

We may not be completing at the end of the line. This happens if the user completes in the middle of the file path, or the editor has inserted a matching quote, etc.

Now we need to decide how much of the line we're going to replace. Various API contracts mean this should be the same for every completion item in a given context. We can re-insert text that already exists in the replaced range (e.g. closing quotes), but the cursor will be placed on the right of it.

Let's look at #include "a/b^c" where c might be empty. (There are other cases, but I think this is the most important).
Our main options are:

  1. replace b, c, and " with the completion (and quote)
  2. replace b and c with the completion
  3. replace b with the completion

Option 3 is dominated by option 2 I think - it tends to produce nonsense tokens. So only 1 and 2 are worth considering.

Advantages of replacing the closing quote:

  • is consistent with inserting closing quotes elsewhere, which seems like useful behavior elsewhere
  • puts us at the end of the line after the completion, ready to hit enter

Disadvantages of replacing the closing quote:

  • less jarring when completing a directory, as it doesn't eat the closing quote

Are there others?
This one seems like a wash to me, and I'm comfortable with the current behavior. In particular I don't see the conflict between inserting quotes and editors auto-pairing quotes that you mention. Can you elaborate?


I don't want to jump to a conclusion, but I would at least figure that the most important part of this change/issue with current behavior is the workflow around completing a directory name. Is that right?

I think we should first think about what are participating in a modularised development environment. In terms of code completion, there are usually four layers:

  • a "backend" which knows everything about the language, e.g. ClangSema
  • a "frontend" which follows some well-known protocol(s), e.g. clangd
  • an adapter which deals with all kinds of quirks of the frontend and editor and ensures everyone's happy, e.g. vscode-clangd or some magic configuration files for vim lsp plugins
  • an editor which the user is accustomed to but knows little about the language, e.g. vscode or vim with some lsp plugin

Now, the completion behaviours we care about.

the default behaviour
I think ClangSema, as a language backend library, targets more about frontend developers rather than end users, and should provide rich but "pure" information to ensure flexibility. It doesn't seem necessary for it the decide which behaviour is more advantageous, especially for some end user oriented behaviours. In this specific case, I suppose it is more reasonable to either provide an option or return the raw info and let the upper levels to decide what to do.

the slashes
Different editors, under different user settings, have differences in completion behaviour, which are not always distinguishable from LSP's ClientCapabilities, e.g.:

  • whether some key needs to be hit to commit an completion: If hitting Enter (or something else) is mandatory and commitCharacters are not supported, it should be better for clangd to include the ending slashes in the completion results; but if committing is implicit, it seems more reasonable to let the user hit / to trigger another completion, and users of these kinds of editors should be unlikely to hit Enter after selecting a completion item
  • is it easy to forcibly trigger a completion: manually triggering completion is uncommon and most editors would bind it to some multi-key shortcuts, which makes the completion less fluent, or the user will need to <Backspace>->/ if another completion is needed immediately after a directory completion

I think clangd should choose a behaviour suitable for the most usual circumstance, while trying to support more cases via options (or init params) and leaving the decision to the adapters.

completion with trailing stuff
When the suffix of a completion item matches with the prefix of the content following the completion point, things get complicated. The backend and language server are able to decide the range that should be replaced, and everything works perfectly as long as the editor correctly support textEdit. But an editor without textEdit support will turn to insertText, and it has to decide whether the "duplication" should be eliminated, which is impossible for a language-natural editor to implement accurately. The editor will either keep them in all cases, or merge them in all cases, producing something like #include "dir/^.h" -> #include "dir/foo.h".h" or ret = Func1(ns::^); -> ret = Func1(ns::Func2();. This is indeed a tricky issue.