This is an archive of the discontinued LLVM Phabricator instance.

[clangd] Handle clangd.applyFix server-side
ClosedPublic

Authored by malaperle on Oct 24 2017, 9:34 PM.

Details

Summary

When the user selects a fix-it (or any code action with commands), it is
possible to let the client forward the selected command to the server.
When the clangd.applyFix command is handled on the server, it can send a
workspace/applyEdit request to the client. This has the advantage that
the client doesn't explicitly have to know how to handle
clangd.applyFix. Therefore, the code to handle clangd.applyFix in the VS
Code extension (and any other Clangd client) is not required anymore.

Diff Detail

Repository
rL LLVM

Event Timeline

malaperle created this revision.Oct 24 2017, 9:34 PM
malaperle set the repository for this revision to rL LLVM.
malaperle added a project: Restricted Project.
ilya-biryukov edited edge metadata.Oct 25 2017, 2:01 AM

This patch intersects with D39057. (Specifically, LSP-specific parts related to executeCommand handling and argument parsing).
An approach taken in D39057 is somewhat more generic, allowing custom refactoring action to be run using workspace/executeCommand.
It doesn't tackle the problem of running the applyFix on the server, though and CommandArgument parsing is different.

Let's start a discussion on how to properly implement (and share the LSP handling/parsing bits between those two patches) there.

It's definitely nice to move this into clangd, rather than have every client implement it on their own.

clangd/Protocol.cpp
105 ↗(On Diff #120184)

While the standard JSON allows doubles, it seems really weird that doubles end up being used here. It is also weird to simply truncate the number to the nearest int, I would much rather expect the LSP server to fail when it sees non-integer numbers at those points.

Can't we fix Eclipse to send ints and not doubles instead?

Just as a side-note. @sammccall is working on a proper JSON parser in D39180 and this code wouldn't be necessary when it's ready.

272 ↗(On Diff #120184)

Why do we even process messages that we can't parse without errors?

We should return an error, rather than trying to silently do something unexpected (e.g. forget to apply one of the edits)

959 ↗(On Diff #120184)

I'd argue the proper place to define a list of known commands is ClangdLSPServer.h, not Protocol.h.
Let's keep Protocol.h a place that handles parsing/handling of general LSP without clangd-specific parts.

@sammccall, @bkramer , what are your thoughts on this?

malaperle added inline comments.Oct 25 2017, 6:54 AM
clangd/Protocol.cpp
105 ↗(On Diff #120184)

I can see if Eclipse needs to be fixes. It might take a lot of time though depending on whether it's deep into a third-party dependency and whether or not I can contribute a fix. But we'll see.

272 ↗(On Diff #120184)

It was logging that it failed to parse a TextEdit, so it's not completely silent. I was trying to make it more lenient, but perhaps that's not necessary. I'll remove this.

959 ↗(On Diff #120184)

I'm not sure what would be the best way to go about this. I need Protocol.cpp to know about the existence of the Clangd-specific command in order to parse (or fail to parse) the params.

sammccall edited edge metadata.Oct 25 2017, 7:32 AM

Cool!

I haven't reviewed the whole implementation (but I can if Ilya would like some help!)

And yeah, I'd like to have a JSON parser so some of this is less painful (and maybe prettyprint the lit-tests)

clangd/Protocol.cpp
105 ↗(On Diff #120184)

As far as JSON is concerned, there's one number type: 4 and 4.0 are the same number. For us to care about how it's serialized on the wire is pretty bad layering violation.

Integers >53 bit are basically non-portable in JSON (called out in the RFC), so I think it's fine to *only* parse as double.

All that said, we can and should reject non-integer numbers (just as we can reject strings with 'Q' in them), so I'd add a check for that: double should be in range and int(D) == D, else return false.

106 ↗(On Diff #120184)

casting double to int is UB if out-of-range, I think

272 ↗(On Diff #120184)

+1, I like leniency in general, but it seems dangerous here: won't subsequent edits will be applied against the wrong text if we've dropped some edits?

959 ↗(On Diff #120184)

Given the protocol says "any" here, I think our options are to *not* parse here (and just capture the JSON for later parsing), or to call ClangdLSPServer from here to do the parsing, or the approach given here.

Already this file de facto parses the subset of LSP that we actually support, so having it know enough about the extensions to parse them seems like the least-worst option to me.

I haven't reviewed the whole implementation (but I can if Ilya would like some help!)

That would be very helpful, I'll be back on vacation tomorrow and won't be of much help reviewing this. Many thanks!

PS Before I go, wanted to advocate for extracting common parts of this change and D39057 (Command and workspace/executeAction handling) into a separate review. It would probably lead to a nicer history and more focused reviews. On the other hand, it may not be worth the hassle, of course, but I still wanted to mention this approach here.

clangd/Protocol.cpp
105 ↗(On Diff #120184)

Well, forget what I said then. Given that JSON defines numbers as double, we're obliged to parse them correctly, even if I find the fact that Eclipse adds fractional part weird

+1 to @sammccall's comments on how we should properly parse numbers and do the range/integer checks.

959 ↗(On Diff #120184)

Yeah, I also agree that putting various clangd-specific CommandArguments here is ok.

On the other hand, I think it's perfectly fine if code inside Protocol.cpp properly parses even commands without looking at its names and ClangdLSPServer will validate the commands later when processing the request. (Approach taken in D39057). That way we'd move the list of clangd-specific command names into ClangdLSPServer (or even into ClangdServer or refactorings engine for other commands)

OK, a bunch more mostly-nit comments, but this looks pretty good to me.

I don't think we should wait on D39057:

  • that's stacked on top of two other patches that have nontrivial work/review to go
  • it takes a broader frameworky approach to editor commands, and its design will benefit from needing to deal with this case
  • they overlap in arg parsing, where there are differences (whether to use a full discriminated union for args, and whether parsing depends on the command) I think this patch makes robust choices

But I'll post a comment on the other review, let's wait a day or two to land in case I'm wrong.

clangd/ClangdLSPServer.cpp
82 ↗(On Diff #120184)

This could use a high-level comment describing the flow (either here or where the constant is defined, but I think here is better).
At first read, it was unclear why we're just echoing the edit back to the client. Then I saw the patch description which is really good :-)

Something like:
The flow for "apply-fix" :
1. We publish a diagnostic, including fixits
2. The user clicks on the diagnostic, the editor asks us for code actions
3. We send code actions, with the fixit embedded as context
4. The user selects the fixit, the editor asks us to apply it
5. We unwrap the changes and send them back to the editor
// 6. The editor applies the changes, and sends us a reply (but we ignore it)

86 ↗(On Diff #120184)

"Fix applied" since we know what the command is!

Ultimately, I think a better thing to do here would be send the applyEdit, wait for the response, and then reply success/failure to the original RPC.
But we're not there yet in terms of infrastructure, and I'm not sure whether it's actually important. I think it's worth a comment though.

88 ↗(On Diff #120184)

Rather than writing JSON here directly, could you add a method e.g. C.call (like C.reply)?

For now, it's fine just to note that results of calls can't be handled. In future we could do something like:

C.call("workspace/ApplyEdit", ApplyWorkspaceEditParams::unparse(ApplyEdit),
  [](Expected<ApplyEditResult> R) { ... });

and Call would associate the callback with the ID of the call so that we can actually handle the response.

96 ↗(On Diff #120184)

nit: "Unsupported command" or similar would be more specific

clangd/Protocol.cpp
959 ↗(On Diff #120184)

@ilya-biryukov: I'm not sure that approach is better.
LSP defines CommandArguments as any[], validity depends on the command name. D39057 pokes around at the argument and tries to guess what type it is, exploiting the fact that the two currently-supported arg types are easy to tell apart.

  • that gets harder if for some reason we end up with two commands that have args {"foo": []} and {"foo": {}} (contrived example, but name conflicts aren't *that* unlikely)
  • unsupported commands take totally different codepaths depending on the shape of their args

I like the approach from this patch better.

986 ↗(On Diff #120184)

So there's some unfortunate duplication/logic spread here between validating the command name and switching on it to parse the args.

I think you could group the command validation and arg parsing more naturally like this:

parse() {
  function<bool(MappingNode*)> ArgParser == nullptr;
  foreach (params) {
     if ("command") {
        Result.command = value;
        if (value == APPLY_FIX) 
         ArgParser = [&](MappingNode*) {
            ...
          };
       else if // other cases
       else // fail: unknown command
     } else if ("arguments" && ArgParser != nullptr) {
        foreach (args) 
          if !ArgParser(arg) // fail: bad arg
     }
  }
  if (Result.command.empty)  // fail: no command
}
996 ↗(On Diff #120184)

so uh, what if "arguments" appears before "command"? :-(

I don't have a good answer to this, it's one of the things I dislike about YAMLParser, and the same thing applies at the top level of our LSP parser. But leave a comment!

(If I manage to get a DOM-flavored JSON parser into LLVM, it will fix this!)

clangd/Protocol.h
386 ↗(On Diff #120184)

can you just make changes a map<string, vector<TextEdit>>, and hide the parse/unparse for individual entries in the CC file?

400 ↗(On Diff #120184)

I'd prefer to leave this out until it's used.

408 ↗(On Diff #120184)

This doesn't directly mirror the structure of the protocol, I think it's the right option but it deserves a comment.

If I understand right, the protocol says this is any[], but in practice the order doesn't matter, there's going to be 0-1 param of each type, and there will likely be overlap between the types of params used by different commands.

The obvious alternative is to make this a discriminated union of the different commands, but this seems simpler to me.

411 ↗(On Diff #120184)

nit: This is a bit too vague to explain much.
/// The command identifier, e.g. CLANGD_APPLY_FIX_COMMAND

414 ↗(On Diff #120184)

nit: just workspaceEdit?

test/clangd/execute-command.test
1 ↗(On Diff #120184)

So I don't say this often, but I'm not sure this testcase is that useful.

I don't think testing every way a message can be malformed scales well, particularly if it involves adding log messages to expose unexpected conditions.
The LSP-based lit tests tend to be pretty brittle and hard to read - having lots of negative tests multiplies this.

There are some more scalable ways we could address it:

  • use a schema-driven parser that's tested (e.g. when I use protobufs I don't need to test with unknown fields). But LSP doesn't have a schema that's easily amenable to this.
  • fuzzing: this is pretty nice to make sure we don't crash - bkramer is experimenting with this.
  • write unit-tests: this only really scales to cases we know are interesting, but is still nicer than lit tests as it can be more targeted and doesn't require logspam. A JSON lib would make this much more feasible I think.

We're not really in a position to do any of those in this patch though - honestly I think just removing this test is the best option for now. (It makes me uncomfortable though!)

test/clangd/fixits.test
18 ↗(On Diff #120184)

This test on the other hand is fantastic :-)

malaperle added inline comments.Oct 26 2017, 8:39 AM
clangd/Protocol.cpp
105 ↗(On Diff #120184)

I actually think we should just leave out this part of the patch and address the double parsing everywhere in a separate patch. I can live with Eclipse being broken for Fixits until then. Does that sound good?

test/clangd/execute-command.test
1 ↗(On Diff #120184)

The value is not so much testing the error messages but exercising the various code paths. While writing the tests I have found and fixed numerous crashes. Our team is also aiming at >= 80% code coverage (a combination of line+branch) which I admit brings sometimes tests that are more theoretical and quite unlikely. I do think that doing this testing method for everything in Protocol.cpp would not scale well and I don't think at this point we should go back and add more tests to existing handled protocols. I see this test as an interim until we have a better JSON parser in which case I think much of this logic will be simplified and the tests unnecessary. I also like the fuzzing approach from bkramer but it doesn't ensure that we cover all the cases we can deduce just by looking at the code coverage data.

sammccall added inline comments.Oct 26 2017, 9:09 AM
clangd/Protocol.cpp
105 ↗(On Diff #120184)

Sure, that sounds fine to me.

test/clangd/execute-command.test
1 ↗(On Diff #120184)

Sure, I'm not going to object to this going in, especially if we can reduce it later.
If you do have code coverage as a way of verifying that these cases get hit, do we need the log messages?

rwols added a subscriber: rwols.Oct 29 2017, 2:03 PM
malaperle updated this revision to Diff 120885.Oct 30 2017, 2:03 PM
malaperle marked 13 inline comments as done.

Address comments.

malaperle added inline comments.Oct 30 2017, 2:04 PM
clangd/ClangdLSPServer.cpp
82 ↗(On Diff #120184)

Sold!

88 ↗(On Diff #120184)

I moved the writeMessage to a new call() method but we'll probably want to revisit this because we'll need to generate proper IDs and handle replies like you mentioned.

clangd/Protocol.cpp
105 ↗(On Diff #120184)

Removed

272 ↗(On Diff #120184)

Removed support for skipping malformed edits.

959 ↗(On Diff #120184)

I kept it as is for now. I think I prefer failing early when parsing the ExecuteCommandParams and not have ClangdLSPServer also parse JSON or try to guess (possibly wrongly) which type the arguments are. It seems safer to me at the expense of Protocol.h/cpp not being pure LSP. But there is not that much consequence for now in making Protocol handle extensions and we can also refactor this without too much problem later.

996 ↗(On Diff #120184)

If arguments is set before, it will log in onCommand. I changed it so that it fails to parse instead. Not ideal but I don't think in practice this is likely to happen and with a better JSON parser we should be able to improve this. Let me know if you think this is a blocker though.
A JSON parser with DOM with be really great!!! I'm definitely keeping an eye on this.

clangd/Protocol.h
386 ↗(On Diff #120184)

there's only one uri + edits so I'll make this a pair instead.

test/clangd/execute-command.test
1 ↗(On Diff #120184)

do you mean removing the STDERR lines from the test or the actual logging in Protocol.cpp? I removed one in Protocol.cpp.

malaperle edited the summary of this revision. (Show Details)Oct 30 2017, 2:04 PM

LG, thanks!

test/clangd/execute-command.test
1 ↗(On Diff #120184)

Yeah I like removing the log entirely.
(I guess you need to remove it from the test too?)

malaperle updated this revision to Diff 121160.Nov 1 2017, 11:42 AM

Remove logging in code and tests.

malaperle added inline comments.Nov 1 2017, 11:43 AM
test/clangd/execute-command.test
1 ↗(On Diff #120184)

Alright, I removed the "Unknown command" one and removed the checks in the test.

hokein added a subscriber: hokein.Nov 2 2017, 5:37 AM
hokein added inline comments.Nov 2 2017, 6:26 AM
clangd/Protocol.h
386 ↗(On Diff #121160)

In the latest LSP (https://github.com/Microsoft/language-server-protocol/blob/master/protocol.md#workspaceedit), WorkspaceEdit has one more optional field documentChanges. I think it is worth a comment describing we don't support it currently?

386 ↗(On Diff #120184)

I agree with Sam here, the changes of WorkspaceEdit should be a map<string, vector<TextEdit>> -- because WorkspaceEdit can contain changes of multiple files.

malaperle added inline comments.Nov 2 2017, 7:20 AM
clangd/Protocol.h
386 ↗(On Diff #121160)

I had a line about it before but Sam preferred not keeping it. I don't mind either way.

386 ↗(On Diff #120184)

In the protocol:
changes?: { [uri: string]: TextEdit[]; };
Sorry, I though the left-most [ ] was to disambiguate the colons. But I just read about indexable types. I'll change this.

sammccall added inline comments.Nov 2 2017, 8:26 AM
clangd/Protocol.h
386 ↗(On Diff #121160)

Doh, sorry about this.

Re-adding a comment here saying that we support currently versioned edits seems fine. Up to you.

ioeric added a subscriber: ioeric.Nov 2 2017, 9:08 AM
malaperle updated this revision to Diff 121349.Nov 2 2017, 1:13 PM

Added comment, changed pair to map.

malaperle updated this revision to Diff 121352.Nov 2 2017, 1:17 PM

Renamed a variable

hokein accepted this revision.Nov 3 2017, 2:19 AM

Thanks for the updates, looks good. Let's check in it :)

This revision is now accepted and ready to land.Nov 3 2017, 2:19 AM
This revision was automatically updated to reflect the committed changes.

Thanks for the reviews!

Please remember to CC cfe-commits for such patches.