This is an archive of the discontinued LLVM Phabricator instance.

[clangd] Initial cancellation mechanism for LSP requests.
ClosedPublic

Authored by kadircet on Aug 9 2018, 3:48 AM.

Event Timeline

kadircet created this revision.Aug 9 2018, 3:48 AM

I have left a few comments, but wanted to start with a higher-level design consideration.

I believe we should probably expose the cancellations in the ClangdServer API directly.
The reasons for that are:

  1. We have an internal client that would also benefit from using the cancellation API and we should try to share as much code as possible between that and the rest of clangd. Creating the cancellation cookies, stashing them in the context are all details that could be handled by ClangdServer so two clients don't have to do the same things.
  2. ClangdServer seems like a natural layer for adding this, since it's a user-facing async API. It spawns async tasks, so it is a perfect place to provide a way to cancel the spawned tasks as well.

The concrete API that I propose is:

  • Relevant methods of ClangdServer (we could start we code completion) that spawn async reqeuests returns a cancellation object that allows to cancel the spawned job:
class TaskHandle {
  void cancel();
};

class ClangdServer {
  TaskHandle codeComplete(File, Position, Callback);
};
  • Users of the API are free to ignore the TaskHandle if they don't need to cancel the corresponding requests.
  • If the users are capable of cancelling the request, they are responsible for keeping the TaskHandle around before the corresponding callback is called.
  • If the request was cancelled, the Callback might receive the TaskCancelledError in the callback and should handle it accordingly.

The rest of the design LG, i.e. the code that can actually be cancelled is responsible for checking CancellationHandler::HasCancelled and should return a corresponding error in the callback if it was cancelled.

clangd/Cancellation.h
2

This file could use some comments about the API and samples of how it can be used.
Could probably be done after we finalize the design, though.

23

I think we might want to expose the context key for the cancellation primitive here.
The reason is that checking if request is cancelled is a common operation (i.e. I can imagine it being called on every few parsed ast node, etc.), so we'd want to keep the overhead of checking for cancellation very low.

Exposing a key in the context would allow to avoid doing the lookups in the context every time we need to check if request was cancelled.

Note that exposing shared_ptr<atomic<bool>> is probably not a great idea, since it'll allow to write into that object too. I suggest we go with a class that wraps it and only allows to read the value of the variable, i.e. only has isCancelled method.

24

The LLVM_NODISCARD is misplaced. The current code applies the attribute to the type (i.e. Context), and we want to apply to the function.
There are two ways to do that:

  1. LLVM_NODISCARD static Context SetCurrentCancellationToken()
  2. static Context SetCurrentCancellationToken LLVM_NODISCARD ()

The first one is definitely less awkward and more widely used, so I suggest we stick to it

26

NIT: use lowerCamelCase for function names (per LLVM style guide)

37

It seems the contract for ErrorInfo actually requires this function to be implemented, so using llvm_unreachable doesn't seem right here.
We could get away without actually unique error codes by returning llvm::inconvertibleErrorCode(), though.

clangd/ClangdLSPServer.cpp
76
  1. Maybe use getAsInteger() to convert directly to int64_t?
  2. Maybe use itostr to avoid conversion from int64 to uint64, which is undefined for negative numbers?
  3. What if input is neither string nor number? Maybe add an assertion and check at the call-sites? That's a form of user input and we don't want clangd to crash in case of incorrect ones.

Alternatively, could we simply pipe the json::Value to a raw_string_stream and get the pretty-printed string? Would avoid dealing with input validation issues.

clangd/ClangdLSPServer.h
172

Maybe clarify in the comment that the key of the map is a serialized request-id?

kadircet updated this revision to Diff 160531.Aug 14 2018, 1:36 AM
kadircet marked 7 inline comments as done.
  • Polished API.
  • Resolved discussions.

Sorry for missing this one. The biggest concern that I have is about the thread-safety of the API, it's too easy to misuse from multiple threads at this point.

We should define thread-safety guarantees for TaskHandle and CancellationToken and make sure they're not easily broken. The suggested way is to make both move-only types.
For CancellationToken we might need a non-thread-safe copy operation, though. But maybe make it a separate method (clone?) with a clear statement that it's not thread-safe?

Other than that just NITs.

clangd/Cancellation.cpp
17

Having a shared_ptr key in the Context can cause data races (e.g. if we copy it concurrently from multiple threads).
I suggest we make CancellationToken move-only (i.e. disallow copies) and return const CancellationToken& when getting it from the context.

34

I'm not sure if this function buys us much in terms of clarity, maybe remove it and inline everywhere?

clangd/Cancellation.h
9

The comments seems to mention all the important bits, but maybe we could separate different concerns more clearly based on how the code should be used?
E.g. here's a rough draft to illustrate the idea:

Cancellation mechanism for async tasks. Roughly all the clients of this code can be classified into three categories:
1. The code that creates and schedules async tasks, e.g. TUScheduler.
2. The callers of the async method that can cancel some of the running tasks, e.g. `ClangdLSPServer`
3. The code running inside the async task itself, i.e. code completion or find definition implementation that run clang, etc.

For (1), the guideline is to accept a callback for the result of async operation and return a `TaskHandle` to allow cancelling the request.
    TaskHandle someAsyncMethod(function<void(llvm::Expected<ResultType>)> Callback);

The callers of async methods (2) can issue cancellations and should be prepared to handle `TaskCancelledError` result:
    TaskHandle H = someAsyncMethod([](llvm::Expected<ResultType> R) {
        if (/* code to check for TaskCancelledError */)
           llvm::errs() << "Task got cancelled";
         else
           llvm::errs() << "The result is: " << *R;        
        } 
    });

    sleep(5);
    H.cancel();
    
The worker code itself (3) should check for cancellations using `CancellationToken` that can be retrieved via `CancellationToken::isCancelled()`.
    void codeComplete() {
      const CancellationToken& CT = CancellationToken::isCancelled();
      if (CT.isCancelled()) 
        return llvm::makeError<TaskCancelledError>();

      runSema();
      if (CT.isCancelled()) 
        return llvm::makeError<TaskCancelledError>();

      queryIndex();  
      return results;
    }
If the operation was cancelled before it could run to completion, it should propagate the TaskCancelledError as a result.
60

We're missing docs on this and the other classes. I suggest a small doc describing what they're used for, e.g. used for checking if the operation was cancelled and used to signal the operation should be cancelled, etc.

Also noting the threading-safety guarantees is a probably a good idea.

61

NIT: move members to the end of the class to be consistent with the rest of clangd.

67

Can we make this ctor private?

71

I wonder if we should make TaskHandle move-only?

The reasoning is that is can be easily used from multiple threads (since it's used by code dealing with async tasks anyway) and copying it concurrently is a data race.
On the other hand, calling cancel() is perfectly thread-safe and shouldn't cause any problems.

74

NIT: maybe use a shorter name, e.g. TaskHandle::create?

82

Maybe remove CancellationHandler and put all its functions into the clangd namespace? It's somewhat easily confused with CancellationToken.
We could declare some function friends if we need to

89

NIT: maybe use a shorter name: CancelledError?

clangd/ClangdLSPServer.cpp
355

I expect the pattern if cancelled -> reply with ErrorCode::RequestCancelled else reply with ErrorCode::InvalidParams to be pretty common when we add cancellation to more operations.

Maybe add a small helper that does that? E.g.

/// Replies to a request with an error, passing proper error codes to the LSP client.
/// Properly handles cancellations.
void replyError(Error E);
627

NIT: use Lock or L for consistencty with the rest of clangd. it's somewhat easy to misread std::lock_guard<std::mutex> _(Mut) as std::lock_guard<std::mutex> (Mut)

kadircet updated this revision to Diff 160636.Aug 14 2018, 10:37 AM
kadircet marked 4 inline comments as done.
  • Get rid of getCancellationError.
  • Add replyError helper.
kadircet added inline comments.Aug 14 2018, 10:37 AM
clangd/Cancellation.cpp
17

As talked offline, copying std::shared_ptr is thread safe.

kadircet updated this revision to Diff 160652.Aug 14 2018, 11:38 AM
kadircet marked 8 inline comments as done.
  • Resolve discussions.
  • Get rid of CancellationHandler class.
  • Change error class name.
  • Improve documentation.
kadircet added inline comments.Aug 14 2018, 2:30 PM
clangd/Cancellation.h
71

As similar to above mentioned case copying doesn't introduce any data races.

kadircet updated this revision to Diff 160986.Aug 16 2018, 2:41 AM
  • Made TaskHandle move-only. Since it is costly and most likely unnecessary to copy it other than to move it into Context.
  • Provided an explicit clone method for copying.

Mostly LSP-specific comments and comments about making TaskHandle thread-safe.

I'm also starting to wonder if the scope of this patch is too big, we could potentially split it into three separate bits:

  1. Generic cancellation API, i.e. Cancellation.h and unit-tests of the API,
  2. ClangdServer using the cancellation API.
  3. LSP-specific bits, i.e. cancelRequest etc.

Not sure if it's worth splitting the last two, but splitting out the Cancellation API in a separate change could make review simpler.
But also happy to review the whole thing and land after it's done, feel free to pick whatever suits you best.

clangd/Cancellation.cpp
17

My mistake, thanks for explaining this one to me.
Copying of CancellationTokens LG!

clangd/Cancellation.h
102

s/asyn/async

ilya-biryukov added inline comments.Aug 16 2018, 3:01 AM
clangd/Cancellation.h
87

NIT: Maybe replace 'Totally thread-safe' with just Thread-safe? The latter conveys the same information.

91

NIT: maybe use Token->load() instead of static_cast<bool>(Token)? Arguably, looks more concise.

93

It's a bit confusing that this name clashes with a member function.
We seem to optimize for making this work anywhere:

if (isCancelled())  // <-- get token from ctx, then use implicit conversion to check if it was cancelled
  return llvm::makeError<CancelledError>();

Which is probably a nice pattern. But we could have two functions that do the same thing with less magic (i.e. implicit conversions) involved at the call sites:

CancellationToken getCurrentCancellation();
void setCurrentCancellation(CancellationToken CT);

inline bool isCancelled() { return getCurrentCancellation().isCancelled(); }

Would allow to get rid of implicit conversion and friend function with the same signature as member. Would also make cases where we actually want to stash the token somewhere cleaner, e.g. instead of CancellationToken CT = isCancelled(), we'll have CancellationToken CT = getCurrentCancellation().
WDYT?

105

As discussed offline, maybe we should make TaskHandle move-only?
Conceptually, TaskHandle is an abstraction that represent some spawned async task and in that sense somewhat similar to futures.

109

I wonder if we should have a createCancellationToken() method and a method to stash it in the context instead?
Would help if we want to make TaskHandle move-only and in general seems like a simpler user API, because things you put and get back out of the context is the same, i.e. it's a CancellationToken.
WDYT?

clangd/ClangdLSPServer.cpp
74

NIT: return std::string, const does not buys us anything here.

354

NIT: remove braces around short, single-statement branches

639

Use a value type here, i.e. std::string instead of const std::string&.

650

Use a value type here, i.e. std::string instead of const std::string&.

651

NIT: remove these braces, they don't buy us anything here?

clangd/ClangdLSPServer.h
178

I'm not sure what associated with only their thread means.
The tokens are associated with the request-ids rather than threads, so having something like ..., they create and remove cancellation handles for the request-id stored in the current Context

179

These two methods look like a nice suit for better wrapped in a RAII-object. It would add the value to the associated map on construction and remove on destruction.
One problem is that all operations calls would become weird, because we'll have to move this RAII object around and moving into lambdas is not fun without C++14 (which allows [x=std::move(y)]() {}..

Maybe add a comment that calls to this function should always be paired and a FIXME that we should have an API that enforces this?

clangd/ClangdServer.cpp
167

NIT: remove braces around single-statement branches

clangd/JSONRPCDispatcher.cpp
134

Isn't there a form of handleErrors that can handle the second case in the last argument?
I.e. can we merge the following if into the handleErrors call?

clangd/JSONRPCDispatcher.h
67

Maybe adds docs explaining what this function does?
Specifically, what ErrorCode it chooses and how the message would look like

114

naming-nit: use lowerCamelCase for function names

114

I suggest we document this function and put to next to reply and replyError, since they are closely related (i.e. the latter put the current request-id into the replies).
We should probably discourage using this function in its documentation unless absolutely necessary, the users should probably directly call reply or replyError instead.

unittests/clangd/CancellationTests.cpp
65

Another interesting case to test:
create multiple threads, propagate cancellation into them and check they get properly cancelled.
Using Notification from Threading.h should allow to write it in a pretty simplistic manner.

The last set of comments is for the previous version, might be missing some updates, sorry about that. Will make sure to review the one too!

kadircet updated this revision to Diff 161018.Aug 16 2018, 6:36 AM
kadircet marked 21 inline comments as done.
  • Resolve discussions.
kadircet added inline comments.Aug 16 2018, 6:36 AM
clangd/Cancellation.h
93

Totally agreed, actually, didn't even notice they had the same signature :D, thanks!
But I still think implicit conversion of CancellationToken to bool seems like a good thing to have.

clangd/ClangdLSPServer.h
179

Yeah that would be really nice to have them in a RAII object, but the thing is cleanup can occur only after the task dies. Which we don't directly know since threads are being detached. So the only way to trigger destruction is at the callback, which is where I call cleanup currently. Maybe we can store the object within context to trigger destructor at the end of the thread, but still don't think it would look any better or ease the job of the caller.

Mostly NITs, but please take a look at the CancelParams handling problem. I believe there might be a potential bug hiding there :-)

clangd/Cancellation.h
88

s/Can be created with clangd::isCancelled()/Tokens for the currently running task can be obtained via clangd::getCurrentCancellationToken/?
(to focus on which kind of tokens can be obtained from the tokens)

Another idea: maybe add a reference no why someone would want to use CancellationToken instead of isCancelled() helper, e.g.

/// If cancellation checks are rare, one could use the isCancelled() helper to simplify the code.
/// However, if cancellation checks are frequent, the guideline is first obtain the CancellationToken for the currently running task with getCurrentCancellationToken() and do cancel checks using it to avoid extra lookups in the Context
138

s/got/was?

clangd/ClangdLSPServer.cpp
628

Wouldn't it work incorrectly for string IDs?
When normalizing json::Value from getRequestID, we simply print out json. For strings, this should result in quoted output, e.g. "req". However, when parsing CancelParams, we parse this json. So we'll end up inserting with a key "req" and erasing with a key 'req' (without the quotes).

650

Could we elog if insert indicated that a value already exists in a map? This would probably mean malformed user input, but could also be immensely helpful for debugging.

clangd/ClangdLSPServer.h
179

Having these paired method calls is probably fine for the first iteration, but we should abstract away this pattern when supporting cancellations for more operations.
LG with a FIXME.

clangd/ClangdServer.cpp
152

Maybe use a shorter name? E.g. Task or TH :-)

153

Instead of moving cancellation token around explicitly, we could stash in a context right away and it would nicely propagate into the thread. This looks like less boilerplate.
E.g.

auto CancellableTaskHandle = TaskHandle::create();
WithContext ContextWithCancellation(setCurrentCancellationToken(TaskHandle.createCancellationToken()));

auto Task = [](File, Callback<CodeCompleteResult> CB, llvm::Expected<InputsAndPreamble> IP) {
  if (getCancellationToken().isCancelled())
    return CB(make_error<CancelledError>());
 };
 /// rest of the code is the same
180–181

Thinking out loud: the TaskHandle and CancellationToken handling will probably eventually move into TUScheduler::runWithPreamble and other thread-spawning code to avoid this kind of boilerplate here.

No need to change anything in this patch, though.

clangd/Protocol.h
865

Maybe add a comment from the LSP spec to be consistent with the rest of the file?
Also note that this can be either a number or a string, but we choose to print out a number in that case and just always use strings.

kadircet updated this revision to Diff 161261.Aug 17 2018, 8:26 AM
kadircet marked 10 inline comments as done.
  • Resolve discussions & Delete enclosing quotes when normalizing.
kadircet marked an inline comment as done.Aug 17 2018, 8:27 AM
kadircet added inline comments.
clangd/ClangdLSPServer.cpp
628

You are right, that's exactly the case, thanks!

kadircet marked 2 inline comments as done.Aug 17 2018, 8:28 AM
ilya-biryukov added inline comments.Aug 17 2018, 8:59 AM
clangd/ClangdLSPServer.cpp
80

This still misses string escaping issues. E.g. " will get quoted as \", etc.
I suggest we actually switch on this json value kind.
Better yet, reuse the part of fromJSON(CancelParams) that does the request-id parsing.

clangd/ClangdServer.cpp
164

NIT: check for it even before looking at the preamble?

kadircet updated this revision to Diff 161428.Aug 20 2018, 1:00 AM
kadircet marked 2 inline comments as done.
  • Check cancellation earlier && Fix normalization difference between string and integer request ids.
kadircet marked an inline comment as done.Aug 20 2018, 1:03 AM
kadircet added reviewers: ioeric, hokein.
ioeric added inline comments.Aug 20 2018, 5:14 AM
clangd/Cancellation.h
97

As chatted offline, I have questions about the motivation of the CancellationToken class. Intuitively, it seems to me that this can be merged into TaskHandle, and we can simply stash the TaskHandle instead of a token into the context. There would be fewer states to maintain, and the design would be a bit more straightforward. I might be missing context/concerns here, so I'd like to hear what you and Ilya think. @ilya-biryukov

clangd/ClangdLSPServer.cpp
75

Use ID.getAsString()?

kadircet added inline comments.Aug 20 2018, 5:52 AM
clangd/ClangdLSPServer.cpp
75

Unfortunately id can be both a string or a number according to LSP specs therefore we normalize both cases into a string.

ilya-biryukov added inline comments.Aug 21 2018, 1:17 AM
clangd/Cancellation.h
97

I am for splitting cancellation checks from cancellation triggers.
The processing code only needs to check if it was cancelled and exposing the cancel() does not add any useful functionality, merely ways to misuse it.

ioeric added inline comments.Aug 21 2018, 1:20 AM
clangd/Cancellation.h
97

Couldn't we prevent this by passing only const handle references to users who are not expect to call cancel()?

ilya-biryukov added inline comments.Aug 21 2018, 3:31 AM
clangd/Cancellation.h
97

As long as TaskHandle does not have a copy constructor, that LG.
I.e. the scheme that seems to work is:

  1. return shared_ptr<TaskHandle> from async computations like code complete.
  2. stash shared_ptr<const TaskHandle> into the context.
  3. return const TaskHandle* from the context.

This should give a simpler implementation.
The cons from my point of view are:

  1. We return copyable task handles (i.e. the shared_ptr) from the API by default. This provokes usages that don't take ownership issues into account, i.e. it's easy to just copy the task everywhere without having clear ownership of it. As long as it's just cancellation there, we're probably fine, but if we add more stuff into it that might become a problem.
  2. We expose implementation details (i.e. shared_ptr) by removing the layer of abstraction. This makes potential refactorings of the client code harder.

In general, I think having our own wrappers there opens up more room for experiments with the API later (the API is very constrained, so it's easy to refactor the implementation). OTOH, if we believe the design is right or changing the clients is not too much work (which is probably true), either way is fine with me.

But either solution has its pros and cons, I'm happy with trying out any of those on code completion, see how it goes and decide whether we want to change anything before finalizing our design and adding cancellation to all operations in clangd.

kadircet updated this revision to Diff 161716.Aug 21 2018, 7:33 AM
kadircet marked 4 inline comments as done.
  • Get rid of CancellationToken && Resolve discussions.

LG, just a few last nits

clangd/Cancellation.cpp
31

NIT: maybe consider inlining it? Since Task has a public default constructor, I don't think having this method buys us anything

clangd/Cancellation.h
23

s/Taskr/Task

24

remove mentions of CancellationToken

50

s/CancellationToken/getCurrentTask().isCancelled()

clangd/ClangdLSPServer.cpp
76

Maybe extract the relevant code into a helper to avoid creating the unrelated class (CancelParams)?

629

Invert condition to reduce nesting?
See LLVM Style Guide

kadircet updated this revision to Diff 161908.Aug 22 2018, 2:32 AM
kadircet marked 6 inline comments as done.
  • Resolve discussions.

Mostly NITs, but please take a look at the potential data race

clangd/Cancellation.cpp
24

Maybe add 'assert' that TaskKey is present?
To make sure we see the breakages as early as possible when asserts are enabled (otherwise it's undefined behavior and it can propagate to breakages in client code)

clangd/Cancellation.h
28

s/propagete/propagate

115

Maybe add a comment that the current Context must be have the task stashed in it?

clangd/ClangdLSPServer.cpp
351

CleanupTaskHandle() can run in a separate thread, so can potentially run before the StoreTaskHandle call.
To avoid memory leaks in that case, let's preallocate the entry before calling codeComplete

628

s/it/It

clangd/Protocol.cpp
629

Maybe switch on different kinds of json::Value instead of using the ObjectMapper?
The code should be simpler. Albeit the fromJson of CancelParams might get a little more complicated, but that looks like a small price to pay.

clangd/Protocol.h
877

Maybe simplify the signature to: Optional<string> parseNumberOrString(const llvm::json::Value&);
And call as parseNumberOrString(obj["id"]) instead of parseNumberOrString(obj, "id")?

Also, maybe call it normalizeRequestId?

kadircet updated this revision to Diff 162342.Aug 24 2018, 1:49 AM
kadircet marked 3 inline comments as done.
  • Change a few comments.
  • Add some assertions.
  • Fix a data race.
kadircet marked 2 inline comments as done.Aug 24 2018, 1:50 AM
kadircet added inline comments.
clangd/ClangdLSPServer.cpp
351

Thanks!

kadircet updated this revision to Diff 162354.Aug 24 2018, 2:48 AM
kadircet marked 2 inline comments as done.
  • Change helper for normalizing Number | String.
clangd/Protocol.h
877

LSP has a few more places that can have paremeters of type number | string so I believe this one could be used with them as well. Therefore I think it is better to keep the name that way. The normalizeRequestID is a sepcial case and handled within ClangdLSPServer.

ilya-biryukov accepted this revision.Aug 24 2018, 5:40 AM

LG with a few small comments

clangd/Protocol.cpp
635

Maybe use itostr?

641

What is Params is not an object?

This revision is now accepted and ready to land.Aug 24 2018, 5:40 AM
kadircet updated this revision to Diff 162365.Aug 24 2018, 5:48 AM
kadircet marked 2 inline comments as done.
  • Resolve comments.
This revision was automatically updated to reflect the committed changes.