We now provide an abstraction of Scheduler that abstracts threading
and resource management in ClangdServer.
No changes to behavior are intended with an exception of changed error
messages.
This patch is preliminary work to allow a revamped threading
implementation that will move the threading code out of CppFile.
Details
Diff Detail
- Repository
- rL LLVM
Event Timeline
clangd/ClangdServer.h | ||
---|---|---|
107 ↗ | (On Diff #130163) | What's so simple about it? Why not clangd::ThreadPool? Also there's llvm::ThreadPool, what's the difference between them? |
clangd/ClangdServer.h | ||
---|---|---|
107 ↗ | (On Diff #130163) | Will rename it to ThreadPool.
I'll also be adding per-unit queues in the latter commit (aka thread affinity) to our thead pool, so it'll have more differences. I suggest waiting a day or two before I send the patch for review. Another minor difference is:
|
So, I simultaneously think this is basically ready to land, and I want substantial changes :-)
This is much better already than what we have, and where I think we can further improve the design, this is a natural point on the way.
My preferred next steps would be:
- we discuss a little bit the directions I've outlined below, without the assumption that they belong in this patch.
- to the extent you agree, you make the less-invasive changes here (e.g. adding a VersionConstraint API, but only actually supporting the cases you've implemented)
- once we're on the same page for the other stuff, I'm happy to pick up any amount of it myself - whatever's not going to step on your toes
clangd/ClangdServer.cpp | ||
---|---|---|
37 ↗ | (On Diff #130163) | Would be nice to have parallel names to the Scheduler methods, e.g. blockingASTRead() and blockingPreambleRead() |
222 ↗ | (On Diff #130163) | Having trouble with this one. |
clangd/ClangdServer.h | ||
164 ↗ | (On Diff #130163) | This is a nice abstraction, so much better than dealing with Cppfile! A couple of observations:
This seems like it would make it easier to explain what the semantics of scheduleRemove are, too.
One benefit here is that the versioning relationship between inputs and outputs no longer show up in the function signatures (by coupling an input to a matching output). Expressing them as data makes it easier to tweak them.
I think it would *probably* be clearer to have versions minted by the external DraftStore, that way we can decouple "we know about the contents of this file" from "we're building this file". e.g. we probably want wildly different policies for discarding resources of old versions, when "resources" = source code vs "resources" = ASTs and preambles.
|
169 ↗ | (On Diff #130163) | this is a performance optimization, right? |
173 ↗ | (On Diff #130163) | This seems like a smell - the command is sometimes an input from callsites that want builds, and sometimes an output to callsites that want builds. Also this class is all about threading, scheduling, and versions, and this method bypasses all of that. (If this is a hack to keep code complete working while we refactor, that's fine - but the docs should say that and we should know what next steps look like!) ISTM the issue here is coupling updates to the source with updates to the compile command. One option is:
|
184 ↗ | (On Diff #130163) | The callback here is a bit confusing and YAGNI (both clangd and our known embedders discard the returned future). |
200 ↗ | (On Diff #130163) | can we pass in a consistency policy here? even if we only support a subset of {ast, preamble} x policy for now, a richer API would enable experiments/tradeoffs later. e.g. enum class VersionConstraint { ANY, // returns the first data that's available, typically immediately EQUAL, // returns the data for the previous update GREATER_EQUAL, // Returns the data for the previous update, or some subsequent update. } (I guess the semantics you've described for AST read are EQUAL, and ANY for preamble read) |
206 ↗ | (On Diff #130163) | these two names seem a bit confusing - might be easier as just CppFiles, Threads? The names seem to be ~synonyms for the types, which I don't think is better than echoing the types. (Often there's a completely separate good name, but I don't think so here) |
- Renamed SimpleThreadPool to ThreadPool
- Removed PCHs from Scheduler's constructor
- Renamed waitFor(AST|Preamble)Action to blocking(AST|Preamble)Read
- Updates
- Remove getLastCommand from Scheduler's interface, pass Inputs to actions instead
clangd/ClangdServer.cpp | ||
---|---|---|
222 ↗ | (On Diff #130163) | Yes. Used your wording, thanks. |
clangd/ClangdServer.h | ||
169 ↗ | (On Diff #130163) | It is probably redundant, yes. Removed it. |
173 ↗ | (On Diff #130163) | I've opted for passing action inputs to the read methods, as discussed offline. The implementation turned out to be ugly, we now store a separate StringMap<ParseInputs> in Scheduler to keep track of the latest inputs and not put that into CppFile. |
184 ↗ | (On Diff #130163) | I'm fine with removing the callback, but I'd like this change to focus on internals of ClangdServer without changing its interface, happy to remove the callback afterwards. |
200 ↗ | (On Diff #130163) | I would rather not extend the API beyond what it can do at the moment, but happy to explore this approach after threading gets rid of the legacy stuff that we have now. |
206 ↗ | (On Diff #130163) |
|
clangd/ClangdServer.h | ||
---|---|---|
164 ↗ | (On Diff #130163) |
scheduleUpdate(File, Inputs) --> get(File).update(Inputs) That adds some complexity to the interface, though. I'd opt for not doing that in the initial version.
|
Agreed we can defer lots of stuff in order to keep this patch compact.
Generally the things I think we can get right before landing:
- names and file organization for new things
- documentation including where we want to get to
clangd/ClangdServer.cpp | ||
---|---|---|
246 ↗ | (On Diff #131105) | I think you want to take a reference here, and then capture by value |
323 ↗ | (On Diff #131105) | this fixme doesn't make sense if we're removing the callback |
345 ↗ | (On Diff #131105) | nit: you dropped a move here |
400 ↗ | (On Diff #131105) | nit: "Inp" in InpPreamble is pretty awkward. In this context, Read might work? or IP. |
403 ↗ | (On Diff #131105) | InpPreamble->Preamble->Preamble is pretty hard to understand. Can we find better names for these things? Or at least unpack it on one line? |
clangd/ClangdServer.h | ||
164 ↗ | (On Diff #130163) |
It should probably happen before the new implementation though, so it pushes that impl in the right direction.
Agreed that we should keep ClangdServer's interface the same for this patch, so we can't change this yet. FIXME? (If you agree with the testing strategy, that'd be worth mentioning)
|
184 ↗ | (On Diff #130163) | Sounds good. ScheduleUpdate -> update, though? Since this callback will go away, it's not really observable that anything is being scheduled. |
200 ↗ | (On Diff #130163) | I'm happy for this not to land in this patch, but with some of the other API changes it should go in before/with the new implementation |
206 ↗ | (On Diff #130163) | executor often names an abstraction that's related to but not the same as a threadpool (c.f. java or google3). Not a big deal, but it doesn't seem to add any extra semantics over "threads" which would be less ambiguous. |
165 ↗ | (On Diff #131105) | I'm not sure about so many little structs, vs providing one with some fields being optional. |
177 ↗ | (On Diff #131105) | This class has important responsibilities beyond threading itself, which "Scheduler" suggests. I can't think of a perfectly coherent name, options that seem reasonable:
|
177 ↗ | (On Diff #131105) | Worth saying something abouth the threading properties here:
|
195 ↗ | (On Diff #131105) | similarly scheduleRemove -> remove |
202 ↗ | (On Diff #131105) | I like "schedule" in the description, but it seems to in-the-weeds for a function name - ideally the verb would be what the caller is trying to do, not how we implement it. runWithAST? or if you want to emphasize the async nature, runSoonWithAST? |
216 ↗ | (On Diff #131105) | duplicate private |
clangd/ClangdServer.cpp | ||
---|---|---|
246 ↗ | (On Diff #131105) | Makes sense, less copies. Thanks. |
400 ↗ | (On Diff #131105) | Used IP |
403 ↗ | (On Diff #131105) | It's now IP->PreambleData->Preamble. Not great, but should make more sense. |
clangd/ClangdServer.h | ||
164 ↗ | (On Diff #130163) |
|
184 ↗ | (On Diff #130163) | update seems good. |
165 ↗ | (On Diff #131105) | Small structs seem better to me. I'd argue they give more type safety, even if it means more typing. |
177 ↗ | (On Diff #131105) | Added comments. |
177 ↗ | (On Diff #131105) | It'd be nice to have some mention of the fact that the class handles threading responsibilities. None of the options seem to capture this. |
202 ↗ | (On Diff #131105) | runWithAST sounds good. Thanks |
clangd/ClangdServer.h | ||
---|---|---|
177 ↗ | (On Diff #131105) | Rename of the Scheduler seems to be the only thing blocking this patch from landing. @sammccall, what option would you prefer? |
clangd/ClangdServer.h | ||
---|---|---|
177 ↗ | (On Diff #131105) | For a descriptive name I think we need something TU-related in there. If we want it to cover threads too, I can't think of anything better than TUScheduler. I'm also happy with Workshop or similar, which prods you to work out what the class actually does. Maybe other people will find this confusing. Will solicit opinions. ... Chatted with @ioeric and @hokein - Workshop is too silly. Suggestions were along the lines of what we've been discussing: TUTaskManager or similar. |
Can you please remove the threading/ subdirectory?
It seems premature for these two files, and TUScheduler doesn't fit. It's unclear that there will be more.
I'd suggest renaming Threadpool.h -> Threading.h, CancellationFlag might fit in there, though up to you.
clangd/threading/TUScheduler.h | ||
---|---|---|
1 ↗ | (On Diff #131926) | this class needs tests |
clangd/threading/ThreadPool.h | ||
1 ↗ | (On Diff #131926) | this class needs tests |
- Remove threading/ dir, moved everything to the top-level
- Rename ThreadPool.h to Threading.h
As discussed offline, basically the only thing to do for testing ThreadPool is to bash it with a workload, and some sort of whole-program stress test seems ideal for this and will also give some coverage to other components (and we should run under tsan!).
TUScheduler on the other hand is a big important class with a clear interface and contract, and is about to get a new implementation - testcases verifying the contracts will be extremely valuable.
These tests don't really need to be concurrency-heavy I think.
clangd/TUScheduler.h | ||
---|---|---|
10 ↗ | (On Diff #131941) | header guards are stale, sorry! |
clangd/Threading.h | ||
10 ↗ | (On Diff #131941) | this one also stale |
- Properly ignore errors.
- Run all requests to completion when destroying ThreadPool.
- Added simple tests for TUScheduler.
- Fixed include guards.
All comments should be addressed now. Let me know if I missed anything else.
clangd/threading/TUScheduler.h | ||
---|---|---|
1 ↗ | (On Diff #131926) | Added a simple test for it. Please take a look and let me know if you have more ideas on how we should test it. |
clangd/threading/ThreadPool.h | ||
1 ↗ | (On Diff #131926) | We have ClangdThreadingTest.StressTest and TUSchedulerTests that both run concurrent operations on ThreadPool. |
Addressed last review comments:
- Rename blockingRead to blockingRun
- Added a comment to ThreadPool's destructor
clangd/TUScheduler.h | ||
---|---|---|
23 ↗ | (On Diff #132100) | It can return 0, which will cause clangd to run synchronously. This function is only called when someone wants to have at least one worker thread for async processing. We can change it if you want, but I'd rather leave it as is in this patch. |