Page MenuHomePhabricator

[clangd] Initial cancellation mechanism for LSP requests.
ClosedPublic

Authored by kadircet on Aug 9 2018, 3:48 AM.

Diff Detail

Repository
rL LLVM

Event Timeline

There are a very large number of changes, so older changes are hidden. Show Older Changes
ilya-biryukov added inline comments.Aug 16 2018, 3:01 AM
clangd/Cancellation.h
86 ↗(On Diff #160652)

NIT: Maybe replace 'Totally thread-safe' with just Thread-safe? The latter conveys the same information.

90 ↗(On Diff #160652)

NIT: maybe use Token->load() instead of static_cast<bool>(Token)? Arguably, looks more concise.

92 ↗(On Diff #160652)

It's a bit confusing that this name clashes with a member function.
We seem to optimize for making this work anywhere:

if (isCancelled())  // <-- get token from ctx, then use implicit conversion to check if it was cancelled
  return llvm::makeError<CancelledError>();

Which is probably a nice pattern. But we could have two functions that do the same thing with less magic (i.e. implicit conversions) involved at the call sites:

CancellationToken getCurrentCancellation();
void setCurrentCancellation(CancellationToken CT);

inline bool isCancelled() { return getCurrentCancellation().isCancelled(); }

Would allow to get rid of implicit conversion and friend function with the same signature as member. Would also make cases where we actually want to stash the token somewhere cleaner, e.g. instead of CancellationToken CT = isCancelled(), we'll have CancellationToken CT = getCurrentCancellation().
WDYT?

104 ↗(On Diff #160652)

As discussed offline, maybe we should make TaskHandle move-only?
Conceptually, TaskHandle is an abstraction that represent some spawned async task and in that sense somewhat similar to futures.

108 ↗(On Diff #160652)

I wonder if we should have a createCancellationToken() method and a method to stash it in the context instead?
Would help if we want to make TaskHandle move-only and in general seems like a simpler user API, because things you put and get back out of the context is the same, i.e. it's a CancellationToken.
WDYT?

clangd/ClangdLSPServer.cpp
74 ↗(On Diff #160652)

NIT: return std::string, const does not buys us anything here.

354 ↗(On Diff #160652)

NIT: remove braces around short, single-statement branches

631 ↗(On Diff #160652)

Use a value type here, i.e. std::string instead of const std::string&.

642 ↗(On Diff #160652)

Use a value type here, i.e. std::string instead of const std::string&.

643 ↗(On Diff #160652)

NIT: remove these braces, they don't buy us anything here?

clangd/ClangdLSPServer.h
178 ↗(On Diff #160652)

I'm not sure what associated with only their thread means.
The tokens are associated with the request-ids rather than threads, so having something like ..., they create and remove cancellation handles for the request-id stored in the current Context

179 ↗(On Diff #160652)

These two methods look like a nice suit for better wrapped in a RAII-object. It would add the value to the associated map on construction and remove on destruction.
One problem is that all operations calls would become weird, because we'll have to move this RAII object around and moving into lambdas is not fun without C++14 (which allows [x=std::move(y)]() {}..

Maybe add a comment that calls to this function should always be paired and a FIXME that we should have an API that enforces this?

clangd/ClangdServer.cpp
167 ↗(On Diff #160652)

NIT: remove braces around single-statement branches

clangd/JSONRPCDispatcher.cpp
135 ↗(On Diff #160652)

Isn't there a form of handleErrors that can handle the second case in the last argument?
I.e. can we merge the following if into the handleErrors call?

clangd/JSONRPCDispatcher.h
67 ↗(On Diff #160652)

Maybe adds docs explaining what this function does?
Specifically, what ErrorCode it chooses and how the message would look like

115 ↗(On Diff #160652)

naming-nit: use lowerCamelCase for function names

115 ↗(On Diff #160652)

I suggest we document this function and put to next to reply and replyError, since they are closely related (i.e. the latter put the current request-id into the replies).
We should probably discourage using this function in its documentation unless absolutely necessary, the users should probably directly call reply or replyError instead.

unittests/clangd/CancellationTests.cpp
64 ↗(On Diff #160652)

Another interesting case to test:
create multiple threads, propagate cancellation into them and check they get properly cancelled.
Using Notification from Threading.h should allow to write it in a pretty simplistic manner.

The last set of comments is for the previous version, might be missing some updates, sorry about that. Will make sure to review the one too!

kadircet updated this revision to Diff 161018.Aug 16 2018, 6:36 AM
kadircet marked 21 inline comments as done.
  • Resolve discussions.
kadircet added inline comments.Aug 16 2018, 6:36 AM
clangd/Cancellation.h
92 ↗(On Diff #160652)

Totally agreed, actually, didn't even notice they had the same signature :D, thanks!
But I still think implicit conversion of CancellationToken to bool seems like a good thing to have.

clangd/ClangdLSPServer.h
179 ↗(On Diff #160652)

Yeah that would be really nice to have them in a RAII object, but the thing is cleanup can occur only after the task dies. Which we don't directly know since threads are being detached. So the only way to trigger destruction is at the callback, which is where I call cleanup currently. Maybe we can store the object within context to trigger destructor at the end of the thread, but still don't think it would look any better or ease the job of the caller.

Mostly NITs, but please take a look at the CancelParams handling problem. I believe there might be a potential bug hiding there :-)

clangd/Cancellation.h
87 ↗(On Diff #161018)

s/Can be created with clangd::isCancelled()/Tokens for the currently running task can be obtained via clangd::getCurrentCancellationToken/?
(to focus on which kind of tokens can be obtained from the tokens)

Another idea: maybe add a reference no why someone would want to use CancellationToken instead of isCancelled() helper, e.g.

/// If cancellation checks are rare, one could use the isCancelled() helper to simplify the code.
/// However, if cancellation checks are frequent, the guideline is first obtain the CancellationToken for the currently running task with getCurrentCancellationToken() and do cancel checks using it to avoid extra lookups in the Context
137 ↗(On Diff #161018)

s/got/was?

clangd/ClangdLSPServer.cpp
621 ↗(On Diff #161018)

Wouldn't it work incorrectly for string IDs?
When normalizing json::Value from getRequestID, we simply print out json. For strings, this should result in quoted output, e.g. "req". However, when parsing CancelParams, we parse this json. So we'll end up inserting with a key "req" and erasing with a key 'req' (without the quotes).

643 ↗(On Diff #161018)

Could we elog if insert indicated that a value already exists in a map? This would probably mean malformed user input, but could also be immensely helpful for debugging.

clangd/ClangdLSPServer.h
179 ↗(On Diff #160652)

Having these paired method calls is probably fine for the first iteration, but we should abstract away this pattern when supporting cancellations for more operations.
LG with a FIXME.

clangd/ClangdServer.cpp
152 ↗(On Diff #161018)

Maybe use a shorter name? E.g. Task or TH :-)

153 ↗(On Diff #161018)

Instead of moving cancellation token around explicitly, we could stash in a context right away and it would nicely propagate into the thread. This looks like less boilerplate.
E.g.

auto CancellableTaskHandle = TaskHandle::create();
WithContext ContextWithCancellation(setCurrentCancellationToken(TaskHandle.createCancellationToken()));

auto Task = [](File, Callback<CodeCompleteResult> CB, llvm::Expected<InputsAndPreamble> IP) {
  if (getCancellationToken().isCancelled())
    return CB(make_error<CancelledError>());
 };
 /// rest of the code is the same
178 ↗(On Diff #161018)

Thinking out loud: the TaskHandle and CancellationToken handling will probably eventually move into TUScheduler::runWithPreamble and other thread-spawning code to avoid this kind of boilerplate here.

No need to change anything in this patch, though.

clangd/Protocol.h
871 ↗(On Diff #161018)

Maybe add a comment from the LSP spec to be consistent with the rest of the file?
Also note that this can be either a number or a string, but we choose to print out a number in that case and just always use strings.

kadircet updated this revision to Diff 161261.Aug 17 2018, 8:26 AM
kadircet marked 10 inline comments as done.
  • Resolve discussions & Delete enclosing quotes when normalizing.
kadircet marked an inline comment as done.Aug 17 2018, 8:27 AM
kadircet added inline comments.
clangd/ClangdLSPServer.cpp
621 ↗(On Diff #161018)

You are right, that's exactly the case, thanks!

kadircet marked 2 inline comments as done.Aug 17 2018, 8:28 AM
ilya-biryukov added inline comments.Aug 17 2018, 8:59 AM
clangd/ClangdLSPServer.cpp
80 ↗(On Diff #161261)

This still misses string escaping issues. E.g. " will get quoted as \", etc.
I suggest we actually switch on this json value kind.
Better yet, reuse the part of fromJSON(CancelParams) that does the request-id parsing.

clangd/ClangdServer.cpp
166 ↗(On Diff #161261)

NIT: check for it even before looking at the preamble?

kadircet updated this revision to Diff 161428.Aug 20 2018, 1:00 AM
kadircet marked 2 inline comments as done.
  • Check cancellation earlier && Fix normalization difference between string and integer request ids.
kadircet marked an inline comment as done.Aug 20 2018, 1:03 AM
kadircet added reviewers: ioeric, hokein.
ioeric added inline comments.Aug 20 2018, 5:14 AM
clangd/Cancellation.h
96 ↗(On Diff #161428)

As chatted offline, I have questions about the motivation of the CancellationToken class. Intuitively, it seems to me that this can be merged into TaskHandle, and we can simply stash the TaskHandle instead of a token into the context. There would be fewer states to maintain, and the design would be a bit more straightforward. I might be missing context/concerns here, so I'd like to hear what you and Ilya think. @ilya-biryukov

clangd/ClangdLSPServer.cpp
75 ↗(On Diff #161428)

Use ID.getAsString()?

kadircet added inline comments.Aug 20 2018, 5:52 AM
clangd/ClangdLSPServer.cpp
75 ↗(On Diff #161428)

Unfortunately id can be both a string or a number according to LSP specs therefore we normalize both cases into a string.

ilya-biryukov added inline comments.Aug 21 2018, 1:17 AM
clangd/Cancellation.h
96 ↗(On Diff #161428)

I am for splitting cancellation checks from cancellation triggers.
The processing code only needs to check if it was cancelled and exposing the cancel() does not add any useful functionality, merely ways to misuse it.

ioeric added inline comments.Aug 21 2018, 1:20 AM
clangd/Cancellation.h
96 ↗(On Diff #161428)

Couldn't we prevent this by passing only const handle references to users who are not expect to call cancel()?

ilya-biryukov added inline comments.Aug 21 2018, 3:31 AM
clangd/Cancellation.h
96 ↗(On Diff #161428)

As long as TaskHandle does not have a copy constructor, that LG.
I.e. the scheme that seems to work is:

  1. return shared_ptr<TaskHandle> from async computations like code complete.
  2. stash shared_ptr<const TaskHandle> into the context.
  3. return const TaskHandle* from the context.

This should give a simpler implementation.
The cons from my point of view are:

  1. We return copyable task handles (i.e. the shared_ptr) from the API by default. This provokes usages that don't take ownership issues into account, i.e. it's easy to just copy the task everywhere without having clear ownership of it. As long as it's just cancellation there, we're probably fine, but if we add more stuff into it that might become a problem.
  2. We expose implementation details (i.e. shared_ptr) by removing the layer of abstraction. This makes potential refactorings of the client code harder.

In general, I think having our own wrappers there opens up more room for experiments with the API later (the API is very constrained, so it's easy to refactor the implementation). OTOH, if we believe the design is right or changing the clients is not too much work (which is probably true), either way is fine with me.

But either solution has its pros and cons, I'm happy with trying out any of those on code completion, see how it goes and decide whether we want to change anything before finalizing our design and adding cancellation to all operations in clangd.

kadircet updated this revision to Diff 161716.Aug 21 2018, 7:33 AM
kadircet marked 4 inline comments as done.
  • Get rid of CancellationToken && Resolve discussions.

LG, just a few last nits

clangd/Cancellation.cpp
30 ↗(On Diff #161716)

NIT: maybe consider inlining it? Since Task has a public default constructor, I don't think having this method buys us anything

clangd/Cancellation.h
22 ↗(On Diff #161716)

s/Taskr/Task

23 ↗(On Diff #161716)

remove mentions of CancellationToken

49 ↗(On Diff #161716)

s/CancellationToken/getCurrentTask().isCancelled()

clangd/ClangdLSPServer.cpp
76 ↗(On Diff #161716)

Maybe extract the relevant code into a helper to avoid creating the unrelated class (CancelParams)?

621 ↗(On Diff #161716)

Invert condition to reduce nesting?
See LLVM Style Guide

kadircet updated this revision to Diff 161908.Aug 22 2018, 2:32 AM
kadircet marked 6 inline comments as done.
  • Resolve discussions.

Mostly NITs, but please take a look at the potential data race

clangd/Cancellation.cpp
23 ↗(On Diff #161908)

Maybe add 'assert' that TaskKey is present?
To make sure we see the breakages as early as possible when asserts are enabled (otherwise it's undefined behavior and it can propagate to breakages in client code)

clangd/Cancellation.h
27 ↗(On Diff #161908)

s/propagete/propagate

114 ↗(On Diff #161908)

Maybe add a comment that the current Context must be have the task stashed in it?

clangd/ClangdLSPServer.cpp
351 ↗(On Diff #161908)

CleanupTaskHandle() can run in a separate thread, so can potentially run before the StoreTaskHandle call.
To avoid memory leaks in that case, let's preallocate the entry before calling codeComplete

621 ↗(On Diff #161908)

s/it/It

clangd/Protocol.cpp
629 ↗(On Diff #161908)

Maybe switch on different kinds of json::Value instead of using the ObjectMapper?
The code should be simpler. Albeit the fromJson of CancelParams might get a little more complicated, but that looks like a small price to pay.

clangd/Protocol.h
883 ↗(On Diff #161908)

Maybe simplify the signature to: Optional<string> parseNumberOrString(const llvm::json::Value&);
And call as parseNumberOrString(obj["id"]) instead of parseNumberOrString(obj, "id")?

Also, maybe call it normalizeRequestId?

kadircet updated this revision to Diff 162342.Aug 24 2018, 1:49 AM
kadircet marked 3 inline comments as done.
  • Change a few comments.
  • Add some assertions.
  • Fix a data race.
kadircet marked 2 inline comments as done.Aug 24 2018, 1:50 AM
kadircet added inline comments.
clangd/ClangdLSPServer.cpp
351 ↗(On Diff #161908)

Thanks!

kadircet updated this revision to Diff 162354.Aug 24 2018, 2:48 AM
kadircet marked 2 inline comments as done.
  • Change helper for normalizing Number | String.
clangd/Protocol.h
883 ↗(On Diff #161908)

LSP has a few more places that can have paremeters of type number | string so I believe this one could be used with them as well. Therefore I think it is better to keep the name that way. The normalizeRequestID is a sepcial case and handled within ClangdLSPServer.

ilya-biryukov accepted this revision.Aug 24 2018, 5:40 AM

LG with a few small comments

clangd/Protocol.cpp
635 ↗(On Diff #162354)

Maybe use itostr?

641 ↗(On Diff #162354)

What is Params is not an object?

This revision is now accepted and ready to land.Aug 24 2018, 5:40 AM
kadircet updated this revision to Diff 162365.Aug 24 2018, 5:48 AM
kadircet marked 2 inline comments as done.
  • Resolve comments.
This revision was automatically updated to reflect the committed changes.