This is an archive of the discontinued LLVM Phabricator instance.

[clangd] C++ API for emitting file status
ClosedPublic

Authored by hokein on Nov 21 2018, 6:17 AM.

Details

Summary

Introduce clangd C++ API to emit the current status of file

Event Timeline

hokein created this revision.Nov 21 2018, 6:17 AM

The code is not polished yet, I'd like to get some high-level comments before moving further. Here is a quick vim demo.

Have you considered moving all status updates into the TUScheduler?
TUScheduler has full information about the file status changes, so it seems to most natural place to provide this kind of information.

sammccall added inline comments.Nov 22 2018, 8:32 AM
clangd/ClangdLSPServer.cpp
787 ↗(On Diff #174915)

we should have a protocol model object (in Protocol.h) for the file status updates.

I think it should *likely* it should be distinct from the FileStatus in TUScheduler - optimizing for easy update/comparison and programmatic checks vs optimizing for matching the desired protocol.

Naming is hard, I'd be tempted to take FileStatus for the protocol object and and make the TUScheduler one something more obscure like TUScheduler::TUState.

Unclear to me which one ClangdServer should expose, my *guess* is we should start by exposing the protocol object, and switch if it doesn't capture enough of the semantics that embedders need.

clangd/TUScheduler.cpp
353

So I'm slightly nervous about just emitting objects directly. The problem is if you have independent properties (e.g. "is using a fallback command" vs "are we computing diagnostics"), then it's a burden on the code here to pass them all every time and a burden on the callback code if you don't.

I think one solution (which *may* be what Ilya's suggesting) is to put a mutable FileStatus object in the ASTWorker, and mutate it in place and then invoke the callback, passing the const FileStatus&.

hokein updated this revision to Diff 175135.Nov 23 2018, 7:27 AM
hokein marked 2 inline comments as done.

Polish the code, address comments.

Thanks for the comment, the patch should be ready for review.

There is one thing I'm not certain: should we stop emitting the file status when the file is removed (similar to the behavior of diagnostics)? For example, the file is removed while the AST is building. The current behavior of the patch will emit them.

clangd/ClangdLSPServer.cpp
787 ↗(On Diff #174915)

SG. Moved FileStatus to Protocol.h.

clangd/TUScheduler.cpp
353

ASTWorker seems the right place for FileStatus, given that interesting statuses are in ASTWorker::update, I think a local FileStatus should be good enough.

ASTWorker is not sufficient to emit all interesting statuses -- at least for PreparingBuild, we invoke getCompileCommand in ClangdServer::addDocument...

hokein retitled this revision from [clangd] **Prototype**: C++ API for emitting file status to [clangd] C++ API for emitting file status.Nov 23 2018, 7:31 AM
hokein edited the summary of this revision. (Show Details)

Hi @hokein, I am just keeping up to date with changes.

clangd/ClangdServer.h
40

It would be unfortunate to have this name clashing with clang::DiagnosticsConsumer indeed.
How about something like FileEventConsumer?

clangd/TUScheduler.cpp
375

Wouldn't some different failure status like FileStatusKind::FailedBuild be appropriate here?

There is one thing I'm not certain: should we stop emitting the file status when the file is removed (similar to the behavior of diagnostics)? For example, the file is removed while the AST is building. The current behavior of the patch will emit them.

It would be reasonable to be consistent with diagnostics: stop emitting the statuses when ASTWorker was put into shutdown mode.
However, the clients will still need to handle statuses for "removed" files somehow: the notifications in the LSP are inherently racy, IIUC there's no way to serialize removeFile and status updates if we're processing the requests in a (truly) asynchronous manner.

hokein updated this revision to Diff 175277.Nov 26 2018, 8:43 AM
hokein marked 2 inline comments as done.

stop emitting file status when ASTworker is shutting down.

It would be reasonable to be consistent with diagnostics: stop emitting the statuses when ASTWorker was put into shutdown mode.

Thanks, sounds fair.

clangd/ClangdServer.h
40

yeah, I plan to rename it later (when the patch gets stable enough). FileEventConsumer is a candidate.

clangd/TUScheduler.cpp
375

The failure status is grouped into FileStatus::messages. I think messages provide enough information to users (if there are something wrong when building the file).

sammccall added inline comments.Nov 27 2018, 1:14 AM
clangd/ClangdLSPServer.cpp
787 ↗(On Diff #175277)

notifying via showMessage looks sensible at first glance as a fallback.
Later we may want to expose a richer API guarded by a client capability, and fallback to showMessage if the capability isn't advertised by the client.

Can you verify that showMessage is a statusbar-type message and not a dialog in VSCode?

787 ↗(On Diff #175277)

At the moment you've got FileStatus seralizing into the wire format of ShowMessageParams which seems at least odd.

We should define and pass ShowMessageParams here, and the transformation from FileStatus -> ShowMessageParams should be explicit (a named function or maybe better: inlined here) as it has a different semantic meaning than just toJSON.

If the plan is to expose the FileStatus object but you're not going to do so in this patch, I think it's OK to leave it in Protocol.h - though we should review its structure as if we were going to expose it.

clangd/ClangdServer.cpp
151

Can you explain how this should be implemented?
It seems like not being able to track PreparingBuild is the main weakness of putting the FileStatus in TUScheduler.

If we can't implement it, fine, but this FIXME suggests it just isn't done yet.

clangd/ClangdServer.h
40

The name/problem isn't new; can we do the rename in a separate patch? (before or after). It may have a blast radius in tests etc.

(I'm not sure about FileEventConsumer as it's easily confused with something that watches for file changes on disk. My suggestion would be CompilationWatcher or TUWatcher or something)

49

we should have PathRef File here too, for consistency.

It's redundant with FileStatus::uri but I think that field is actually a layering violation: DiagnosticsConsumer knows about file paths, not URIs.

The more I think about this, the more I think we should move FileStatus to the ClangdServer or TUScheduler layer and exclude URI. If we later want to expose the info via LSP we should define a separate struct in Protocol. WDYT?

clangd/Protocol.h
937 ↗(On Diff #175277)

"StatusKind" doesn't carry much meaning.
More subtly, it implies a hierarchy of information in FileStatus. I don't think this reflects reality: if kind = BuildingPreamble and messages = {"has fallback compile command"} then these are peers, not primary/secondary.

I'd suggest calling this FileAction instead. In this case you may consider renaming Ready -> Idle (as "ready" doesn't really tell you what the action is/isn't).

943 ↗(On Diff #175277)

"ready" also seems a little strong: if a file failed to build, it will be in this state but isn't really "ready" for anything.

947 ↗(On Diff #175277)

(I think this comment doesn't quite make sense: window/showMessage is only revelant to clients that receive JSON, and the JSON doesn't follow this structure)

957 ↗(On Diff #175277)

Some of the most important messages we need to show are warnings (failed compilation, fallback command). These deserve some visual indicator in the client.
You're also sending messages that are unimportant to users ("reusing AST").

I'd suggest either splitting these up into vector<string> messages, vector<string> warnings, or adding a severity.

957 ↗(On Diff #175277)

in fact, I'm not sure vector<string> is the right model here at all (for the implementation in TUScheduler) - what happens when you have multiple messages produced at different places?

clangd/TUScheduler.cpp
269

Why a new mutex rather than reusing ReportDiagnostics?

355

this seems like it could be a method on ASTWorker instead?

420

This looks like the wrong layer to me. Suppose you have a queue like this:

[update] [go to def] [update] [go to def]

As implemented, your status is going to be
building ready building ready
which seems misleading.

This makes me think:

  • the "ready"/"idle" state is a property of the worker thread, not any particular action
  • maybe we need another status (running action) and then a string field to tell us which action it was.
hokein updated this revision to Diff 175859.Nov 29 2018, 5:44 AM
hokein marked 10 inline comments as done.
  • address review comments
  • drop the LSP change, only focus on TUScheduler in this patch

@ilya-biryukov, I hope the current patch is not too big for you to review, happy to chat offline if you want (sam and I had a lot of discussions before he is OOO).

clangd/ClangdLSPServer.cpp
787 ↗(On Diff #175277)

On the second thought, solving the LSP is not the main goal of this patch, so I drop this part out (will send out a follow-up patch on the LSP part).

clangd/ClangdServer.cpp
151

Given the current code, we can't implement it, but I do think this status is valuable (especially for some build system that might be slow to prepare the building environment). Rephased the FIXME, and removed this state in the struct definition (to avoid misleading).

clangd/ClangdServer.h
49

Done, use File in the interface for consistency.

The more I think about this, the more I think we should move FileStatus to the ClangdServer or TUScheduler layer and exclude URI. If we later want to expose the info via LSP we should define a separate struct in Protocol. WDYT?

That sounds good. The FileStatus aims to be the LSP structure in this patch. I agree we should have a dedicate structure which represents the internal status of TU/file.

clangd/TUScheduler.cpp
420

A sensible status for [update] [go to def] would be queued buildingPreamble buildingFile RunningAction(GoToDef)

@ilya-biryukov, I hope the current patch is not too big for you to review, happy to chat offline if you want (sam and I had a lot of discussions before he is OOO).

Picking it up. Mostly NITs from my side and a few questions to better understand the design. Also happy to chat offline.

clangd/ClangdServer.h
40

ClangdServerCallbacks? xD

49

Have we thought about the way we might expose statuses via the LSP?
The TUStatus seems a bit too clangd-specific (i.e. BuildingPreamble, ReuseAST is not something that makes sense in the protocol). Which might be fine on the ClangdServer layer, but it feels like we should generalize before sending it over via the LSP

clangd/TUScheduler.cpp
209

Naming: emitTUStatus?

242

Maybe add a GUARDED_BY(DiagMu) comment?
Also maybe put closer to the mutex itself, e.g. right before ReportDiagnostics.

585

Code style: remove braces around a single-statement if branch

639

NIT: remove braces?

clangd/TUScheduler.h
60

What's the fundamental difference between BuildingFile and RunningAction?
We will often rebuild ASTs while running various actions that read the preamble.

Maybe we could squash those two together? One can view diagnostics as an action on the AST, similar to a direct LSP request like findReferences.

64

Maybe set default value to avoid unexpected undefined behavior in case someone forget to initialize the field?

73

NIT: maybe clarify: indicates whether clang failed... or similar?

74

Naming: should this be BuildFailed?

75

NIT: should this be reused?

76

ReuseAST?

unittests/clangd/TUSchedulerTests.cpp
724

It seems we could make this test deterministic if we:

  1. create a Notification
  2. make it ready after calling removeDocument.
  3. wait for it to become ready at the point where we have sleep_for now.
hokein updated this revision to Diff 176098.Nov 30 2018, 5:25 AM
hokein marked 11 inline comments as done.

Fix nits.

hokein added inline comments.Nov 30 2018, 5:26 AM
clangd/ClangdServer.h
49

The TUStatus seems a bit too clangd-specific (i.e. BuildingPreamble, ReuseAST is not something that makes sense in the protocol).

Yes, this is by design. TUStatus presents the internal states of clangd, and would not be exposed via LSP.

Some ways of exposing status via the LSP

  • define a separate structure e.g. FileStatus in LSP, render TUStatus to it, and sending it to clients (probably we might want to add a new extension textDocument/filestatus)
  • reuse some existing LSP methods (window/showMessage, window/logMessage), render TUStatus to one of these methods' params.
clangd/TUScheduler.h
60

They are two different states, you can think RunningAction means the AST worker starts processing a task (for example Update, GoToDefinition) and BuildingFile means building the AST (which is one possible step when processing the task).

In the current implementation, BuildingPreamble and BuildingFile are only emitted when AST worker processes the Update task (as we are mostly interested in TU states of ASTWorker::update); for other AST tasks, we just emit RunningAction which should be enough.

Given the following requests in the worker queue:
[Update]
[GoToDef]
[Update]
[CodeComplete]

statuses we emitted are

RunningAction(Update) BuildingPreamble BuildingFile
RunningAction(GoToDef)
RunningAction(Update) BuildingPreamble BuildingFile
RunningAction(GetCurrentPreamble)
Idle

64

Done.

ilya-biryukov added inline comments.Nov 30 2018, 5:49 AM
clangd/ClangdServer.h
49

LG, I believe the BuildingPreamble status in particular might be useful on the C++ API level for us.
Not sure showMessage or logMessage is a good idea, the first one will definitely be annoying if it'll be popping up all the time. Having a new method in the LSP LG.

clangd/TUScheduler.cpp
269

The comment duplicates the code, maybe remove it?

270

Is Status actually ever read from multiple threads?
I assume it's only accessed on the worker thread, right? I think we can leave out the locks around it and put beside other fields only accessed by the worker thread, i.e. DiagsWereReported, etc.

PS Sorry go going back and forth, I didn't think about it in the previous review iteration.

582

NIT: Maybe change Stop emitting to Do not emit?
The current wording seems to suggest we're gonna signal some other code to stop doing that...

639

Maybe do it outside the lock? The less dependencies between the locks we have, the better and this one seems unnecessary.

clangd/TUScheduler.h
57

Maybe remove this value and leave out the default constructor too?
That would ensure we never fill have State field with undefined value without having this enumerator.
And FWIW, putting any value as the default seems fine even if we want to keep the default ctor, all the users have to set the State anyway.

60

That's the confusing part: clangd might be building the ASTs inside RunningAction too, but we only choose to report BuildingFile during updates.
I would get rid of BuildingFile and change it to RunningAction(Diagnostics) instead, it feels like that would allow avoiding some confusion in the future.

66

Ah, the annoying part about member initializers :-(
(Just grumbling, no need to do anything about it)

unittests/clangd/TUSchedulerTests.cpp
703

Using a single matcher with two params could improve readability, e.g.

ElementsAre(TUState(TUAction::Queued, ""), TUState(TUAction::RunningAction, "Update"), ...)

Feel free to leave as is if you like the current one better, of course. Just a suggestion.

hokein updated this revision to Diff 176111.Nov 30 2018, 6:41 AM
hokein marked 12 inline comments as done.

Address comments and fix a bug in the test.

hokein marked an inline comment as not done.Nov 30 2018, 6:43 AM
hokein added inline comments.
clangd/ClangdServer.h
49

showMessage seems work well via vim YCM, but it is annoying in vscode (as it pops up diags) -- we need to customize this behavior in our own extensions (showing messages in status bar). We need to revisit different ways.

clangd/TUScheduler.cpp
270

Unfortunately not, most statuses are emitted via worker thread, but we emit Queued status in the main thread...

639

Is it safe to read the Requests out of the lock here?

clangd/TUScheduler.h
57

We need the default constructor... When ASTWorker is constructed, we don't know the any state, I'm nervous about putting any value as default value (it will hurt the code readability, future readers might be confused -- why the status is Queued when constructing).

Adding an Unknown seems ok to me, it is a common practice...

60

BuildingFile might be more natural (understandable) than Diagnostics. Two possible ways here:

  • we can also report BuildingFile for other AST tasks, does it reduce the confusing part?
  • or if you think Diagnostic is an important state of Update, how about adding a new state RunningDiagnostic? RunningAction is a higher level information IMO, clangd is running diagnostic inside RunningAction(Update).

WDYT?

66

Yeap, if we use C++14, we can get rid of it.

unittests/clangd/TUSchedulerTests.cpp
724

Good suggestion, indeed it found a bug in the previous code...

Sorry for the delays. Not sure about emitting Queued on the main thread, see the corresponding comment. Otherwise looks very good.

clangd/ClangdServer.h
49

I'd err on the side of getting a new LSP method/extension that's supposed to show messages in the status bar.
It seems showMessage was specifically designed for user interactions and not status bar updates. But that's out of the scope of this patch, so definitely not blocking it.

clangd/TUScheduler.cpp
270

Hm, that sounds surprising.
What if the rebuild is in progress on the worker thread and we emit the queued status on the main thread at the same time? We might get weird sequences of statuses, right?
E.g. Queued → BuildingPreamble → [new update on the main thread] Queued → [processing request on a worker thread] RunningAction('XRefs')

The Queued request status between BuildingPreamble and RunningAction looks really weird.
Maybe we could avoid that by setting emitting the Queued status on the worker thread whenever we're about to sleep on the debounce timeout?

639

No, but we could store the value of .empty() in a local var under the lock. The code looks a bit weirder, but that's a small price to pay for less locks being held at the same time.

clangd/TUScheduler.h
57

Idle would be the valid state for an ASTWorker with an empty queue. Why not use it instead?

Can't disagree using Unknown is a common practice, but is it a good one? It forces every caller to handle this value, even though it's semantics are always unclear.
Why not use Optional<EnumWithoutUnkown> if I really want to indicate the lack of value in those cases? Otherwise every enum type implements its own version of Optional.

60

If you look at the TUScheduler as a black box, diagnostics are not that much different from other actions reading the AST.
I'd err on the side of not sending updates unless something interesting is actually going on.
What we're currently doing makes sense (sending a single update per action), I think it doesn't really matter if we call it BuildingFile or RunningAction("Diagnostics") or RunningDiagnostics. Pick the one you like.

Let's put a comment on BuildingFile that it's only emitted when building the AST for diagnostics and the read actions emit RunningAction instead

hokein updated this revision to Diff 176588.Dec 4 2018, 3:09 AM
hokein marked 6 inline comments as done.

Address review comments

  • remove Unknown enum type
  • make TUState only accessed by the worker thread.
hokein added inline comments.Dec 4 2018, 3:10 AM
clangd/TUScheduler.cpp
270

This sounds fair enough, and can simplify the code.

Maybe we could avoid that by setting emitting the Queued status on the worker thread whenever we're about to sleep on the debounce timeout?

The Queued state is originally designed for the state that the worker thread is blocked by the max-concurrent-threads semaphore.
Emitting it when we're about to sleep on the debounce timeout doesn't seem a right place to me. I think a reasonable place is before requiring the Barrier.

639

SG.

clangd/TUScheduler.h
57

Idle seems a compromising state for initialization.

60

Done. added comment.

ilya-biryukov accepted this revision.Dec 5 2018, 10:12 AM

LGTM, still have a few NITs and suggestions (see the comment about emitting "queued" on waiting for diagnostics and not emitting queued if the lock was acquired from the first try), but they don't look like something that should block this change.

clangd/TUScheduler.cpp
270

Ah, good point, waiting on a barrier can definitely take some time and is a form of queuing that we might want to notify the clients about.
It would be nice to notify only when the Barrier couldn't be acquired right away though, this probably be achieved adding try_lock() on the semaphore and only emitting a Queued notification if it fails. That way we'll avoid spamming the clients with non-useful statuses.

Wrt to deciding on whether we should notify when waiting on a debounce timer or on acquiring a semaphore... Why not both?

422

Should we also emit this status if we reuse the AST, but still report the diagnostics?

626

See the other suggestion about emitting this status only when locking the barrier failed.
This might be a good fit for a different patch, though, maybe add a FIXME here?

clangd/TUScheduler.h
75

NIT: start with a capital letter to be consistent with the rest of the codebase.

77

NIT: start with a capital letter to be consistent with the rest of the codebase.

This revision is now accepted and ready to land.Dec 5 2018, 10:12 AM
hokein updated this revision to Diff 176936.Dec 6 2018, 1:25 AM
hokein marked 8 inline comments as done.

Address comments.

hokein edited the summary of this revision. (Show Details)Dec 6 2018, 1:26 AM
hokein added inline comments.
clangd/TUScheduler.cpp
270

Sounds good.

422

Yes, I missed that. And I think we also should emit a status if buildAST fails.

626

Agree, added a fixme, will do it in a follow-up patch.

hokein added a comment.Dec 6 2018, 2:23 AM

Forgot to associate this patch to the actual commit, committed in rL348475.

hokein closed this revision.Dec 6 2018, 2:23 AM