Details
Diff Detail
- Repository
- rG LLVM Github Monorepo
Event Timeline
This is working now. However, the handshake is actually deferred for
ProjectAwareIndex until the first request. I tried to invoke it via something
like Opts.StaticIdx->estimateMemoryUsage() in ClangdMain.cpp but the
problem is getIndex() will return nullptr until the Config::current is
ready. I'm not entirely sure when this happens so I can't promote handshake to
some sensible time point yet.
Haven't checked the details but is there a specific reason for implementing a custom protocol rather than making use of NotifyOnStateChange (https://grpc.github.io/grpc/cpp/classgrpc_1_1_channel_interface.html) or even WaitForStateChange if we really want to block the channel creation ?
Also if there's a need to make this an RPC request, there's the default HealthCheck request we can send, as described in https://github.com/grpc/grpc/blob/master/doc/health-checking.md. This is already enabled for remote-index-server in https://github.com/llvm/llvm-project/blob/master/clang-tools-extra/clangd/index/remote/server/Server.cpp#L310, but I suppose registering a callback for statechange and logging that should be enough for a "debugging" use case (not sure what the motivation currently is).
It'd be nice to know what problem this is trying to solve :-)
I can guess, but don't feel entirely sure here.
Right, this is usually the way to go: we should definitely log NotifyOnStateChange events (note that the connection can go between available/unavailable many times over the life of the program). WaitForStateChange would be useful for blocking, but in a program where we want to proceed even if the connection is down (which is AFAIK the plan here) it's usually best not to block, and just handle fast-failure because the channel is down the same way as any other RPC error.
(I can imagine scenarios where we'd want to notify the user with showMessage or so whether the index is connected, but this requires more design)
clang-tools-extra/clangd/index/remote/Client.cpp | ||
---|---|---|
177 | This causes getClient to become blocking. Granted, we expect this to be used from ProjectAwareIndex which currently places it on a separate thread anyway. But in general, what's the advantage of blocking here? | |
181 | this means if we're offline or so when clangd starts up, the remote index will not work until restarting clangd. Do we want this behavior? | |
clang-tools-extra/clangd/index/remote/Index.proto | ||
14 ↗ | (On Diff #307943) | I find this protocol confusing, e.g. the "checksum" isn't actually a checksum of anything. |
Yes, the idea here is to provide logging for gRPC server state. My original thought was that default HealthCheck might not be enough and we might want to add more things in the mechanism but I guess it's not obvious what these extensions might be and getting grpc_connectivity_state might be enough.
Thanks! Hooking into the native status stuff seems much nicer!
Sorry if it seems like we're going in circles here, I think we should be careful about adding background threads so want to look at an alternative too.
clang-tools-extra/clangd/index/remote/Client.cpp | ||
---|---|---|
26 | this seems pedantic, but we really shouldn't specialize templates we don't own for types we don't own (ODR concerns) I'd just make this function (state -> StringRef) and call it | |
32 | any reason not to make these a bit more human-readable? connecting/ready/transient failure/shutdown? | |
122 | The thread is a bit surprising here. I was assuming we'd use some callback-based API to watch rather than spawn a thread to do the "work" of blocking on a change. Ah, so there's no callback-type connection-changed API! Sorry to mislead you :-( I hadn't read the API carefully. (I think this is coming as some point - they're talking about replacing CompletionQueue with EventManager which is a callback-based abstraction) Seems the "async" option is to subscribe on a CompletionQueue, but then we have to poll it. We can use a zero timeout so the poll doesn't block... But then we might as well poll GetStatus directly. Then the question is where to do that from. What if we called GetStatus before/after each request, and reported whenever it has changed? That means:
| |
131 | This is spinning in the background keeping the channel alive (true)... I'm actually not sure that's advisable. If the connection is being used, it'll stay alive in the same way. So true here affects idle (and init) behavior. Specifically, we maintain the TCP connection for the idle period to reduce latency of the first couple of requests after waking up (reestablish connection). I'm not sure this is a good trade, idle could easily be long and I don't think we know better than the OS when to drop a connection. (Cost might be battery life, background CPU load... IDK but probably something) I think we should ask the connection to establish on startup (*high* chance we need it soon) but otherwise only in response to requests | |
132 | Can we include the remote address and prior status? e.g. "Remote index ({0}): {1} => {2}" |
Resolve the comments.
clang-tools-extra/clangd/index/remote/Client.cpp | ||
---|---|---|
122 | ("resolved" in this version) I can understand your point and I support the arguments but this would mean that we only get status during requests and can't query it at arbitrary time which seems like a loss. Maybe it's OK because there might not be many interesting cases when we want to query the server status but not do any requests and your proposal makes it easier to implement & maintain. But also maybe it could be nice to have some kind of callback on status change. As you mentioned, I was looking for something like this in gRPC itself but couldn't find it there outside the CompletionQueue and EvenManager things. Sadly, I don't think we'd be in a position to use the replacement even when it's there since we have to support older gRPC versions :( When I see the logs and the server connectivity updates are following the requests, it looks OK but I'm not entirely sure giving up on _watching_ the status is great. But it still makes life easier so maybe this change should be about it. | |
132 | I can see how it'd be nice but I'm not sure how to do this properly. My only idea is to query prior status in the beginning of the request? Otherwise it needs to be stored somewhere and gRPC interfaces require const functions, so IndexClient does not change the state within requests. I could either wrap it into something or have global grpc_connnectivity_state for the last observed status but neither look OK. Maybe querying the prior status right before the request is OK (that's what I've done here) but I'm not sure if that's exactly what you suggested. |
clang-tools-extra/clangd/index/remote/Client.cpp | ||
---|---|---|
106 | we might want to put the server address at the end in [], mirroring D92181 | |
122 | I agree, it's absolutely a loss vs watching the state continuously. In principle we could #ifdef based on GRPC version in the future, in practice you're probably right that we just suffer without this :-( | |
132 | TL;DR: mutable atomic Using a mutable member is the usual solution to "I want to be stateful but have to adhere to a const interface. With mutable you usually have to add a mutex to provide the usual thread-safety guarantees (const = threadsafe). However in this case we just want to detect transitions, and atomics are great at this: auto NewStatus = getStatus(); auto OldStatus = last_status_.exchange(NewStatus); if (NewStatus != OldStatus) log(...); This is guaranteed to log each time last_status_ changes value. (I think this will correspond closely enough to actual status changes, though calling using a mutex and calling getStatus() under the mutex would be a little more accurate) Note that there's no difference between what you want to do before vs after an RPC - both are just chances to detect a transition - so pull out a function and call it in both places. (I'm fairly sure atomic<grpc_connectivity_state> should be directly usable, on any decent impl this should be the same implementation as atomic<underlying_type<grpc_connectivity_state>>) |
LG with one fix around the atomic
clang-tools-extra/clangd/index/remote/Client.cpp | ||
---|---|---|
49 | you access ConnectionStatus twice, and have a race between them. auto OldStatus = ConnectionStatus.exchange(NewStatus); if (OldStatus != NewStatus) log(...) |
clang-tidy: error: 'Service.grpc.pb.h' file not found [clang-diagnostic-error]
not useful