This is an archive of the discontinued LLVM Phabricator instance.

[clangd] Cache preambles of closed files
Abandoned · Public

Authored by qchateau on Dec 28 2020, 1:51 PM.

Details

Summary

When a file is closed, push its preamble to an LRU cache.
When a file is opened, try to get the preamble from the LRU cache.

By default, keep 10 preambles if they are stored in memory and
1000 if they are stored on disk. That value can be modified
with --keep-preambles=N.
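
For illustration only, a minimal sketch of the close/open flow described above, assuming a count-bounded LRU keyed by file path. The CachedPreamble and PreambleLRUCache names are hypothetical and do not reflect clangd's actual PreambleData/TUScheduler interfaces.

  // Hypothetical sketch, not clangd's actual API: an LRU of preambles keyed
  // by file path, bounded by a maximum entry count (--keep-preambles=N).
  #include <cstddef>
  #include <list>
  #include <memory>
  #include <string>
  #include <unordered_map>
  #include <utility>

  struct CachedPreamble {}; // stand-in for the real preamble data

  class PreambleLRUCache {
  public:
    explicit PreambleLRUCache(std::size_t MaxEntries) : MaxEntries(MaxEntries) {}

    // When a file is closed, remember its preamble (most recent first).
    void push(const std::string &File, std::shared_ptr<CachedPreamble> P) {
      if (auto It = Index.find(File); It != Index.end())
        Order.erase(It->second);
      Order.push_front({File, std::move(P)});
      Index[File] = Order.begin();
      while (Order.size() > MaxEntries) { // evict the least recently used
        Index.erase(Order.back().first);
        Order.pop_back();
      }
    }

    // When a file is opened, try to reuse a cached preamble.
    std::shared_ptr<CachedPreamble> take(const std::string &File) {
      auto It = Index.find(File);
      if (It == Index.end())
        return nullptr;
      auto P = std::move(It->second->second);
      Order.erase(It->second);
      Index.erase(It);
      return P;
    }

  private:
    using List = std::list<std::pair<std::string, std::shared_ptr<CachedPreamble>>>;
    std::size_t MaxEntries;
    List Order;
    std::unordered_map<std::string, List::iterator> Index;
  };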

Diff Detail

Event Timeline

qchateau created this revision. Dec 28 2020, 1:51 PM
qchateau requested review of this revision. Dec 28 2020, 1:51 PM
qchateau updated this revision to Diff 314152. Dec 30 2020, 3:50 PM
  • Fix keep preamble command line option
  • Fix tests

New year's ping?

nridge added a subscriber: nridge. Jan 10 2021, 2:54 PM

Thanks! This is a really good idea, but not without its risks :-) Sorry for not getting to this for a while!

My only *really* high-level design question is: I wonder what the tradeoffs are in keeping just the preamble vs the whole ASTWorker. I'd *expect* this approach gets almost all the benefit without complicating the basic assumptions around ASTWorker, so it's probably the right way to go. Haven't thought about it much though...


I've got a bit of cognitive load because there are a few preamble-related ideas we've had, and I'm trying to work out how they interact. Let's start by writing them down...

  1. a *persistent* cache so closing+reopening clangd loses less state. (This is complicated because only the PCH is easily serializable, the rest of the PreambleData struct isn't)
  2. building caches of preambles while background-indexing (this would be great for modules but is probably way too big for whole preambles)
  3. reusing the "wrong" preamble initially when you open a new file, to give some basic functionality (using existing preamble patching logic, just in a more aggressive scenario)
  4. having the disk-based storage unlink the file preemptively, to eliminate any chance of leaking the *.pch

1 & 2 are both big projects, so I'm not really worried about the overlap, or at least I'm not going to try to predict how it'd be resolved.

3 is actually made *easier* by this patch I think: having a central repository of preambles makes it more natural to grab some "other" preamble.

4 I think doesn't really interact at all: the PreambleData object would own a file descriptor instead of a file, but it's all much the same.
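
As a purely illustrative, POSIX-only sketch of what idea 4 could look like (not clangd code): open the temporary *.pch, unlink it immediately, and keep only the file descriptor; the data stays readable through the descriptor and is reclaimed by the OS once the descriptor is closed, so a crash cannot leak the file.

  // Illustrative POSIX-only sketch; not part of this patch.
  #include <fcntl.h>
  #include <unistd.h>

  int openAndUnlinkPCH(const char *Path) {
    int FD = ::open(Path, O_RDWR);
    if (FD < 0)
      return -1;          // caller handles the error
    ::unlink(Path);       // drop the directory entry; data lives until close()
    return FD;            // the owner (e.g. PreambleData) closes it when done
  }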

So I think in summary there's not really any conflict to resolve with these ideas. cc @kadircet though who's done more thinking about #1 and #3 than I have.


I think we need to be fairly careful about policies around this cache. Preambles are large (we frequently see several hundred MB). Some workflows involve opening many files at a time. Some workflows end up running multiple copies of clangd on the same files. Some configurations keep them in memory rather than on disk. So a too-large cache could waste quite a lot of resources.

So, some pointy questions:

  • landing so close to the 12 release, should we conservatively default this to 0, and require opt-in?
  • is MB or #preambles a better limit to the cache size? (a rough sketch of a byte-budget approach follows this list)
  • should we take size into account when deciding what to evict? (my guess is no, cost scales with size, and value scales with size * probability of reuse, so we should purely optimize for probability of reuse)
  • can we do better than LRU? The cache is accessed so infrequently and misses are so horrendously expensive that we could certainly afford to, e.g., track usage history of every file ever opened, if it would help performance and not add too much complexity.
  • not a question, but I can say for sure that 1000 with no size limit isn't a safe default for disk :-(
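
As a sketch of the MB-based alternative raised above, hypothetical and not part of this patch: keep a running total of estimated preamble sizes and evict from the LRU tail until the total fits a byte budget. The SizeBytes field is an assumed estimate; clangd does not expose such an interface.

  // Hypothetical size-based eviction; not clangd code.
  #include <cstddef>
  #include <list>

  struct Entry {
    std::size_t SizeBytes; // estimated preamble size (assumed measurable)
  };

  void evictToBudget(std::list<Entry> &LRU, std::size_t &TotalBytes,
                     std::size_t BudgetBytes) {
    while (TotalBytes > BudgetBytes && !LRU.empty()) {
      TotalBytes -= LRU.back().SizeBytes; // least recently used is at the back
      LRU.pop_back();
    }
  }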
clang-tools-extra/clangd/TUScheduler.cpp
193

we might instead consider keeping "active" preambles in the cache, and simply treating their cost as 0/ineligible for eviction while shared_ptr::use_count() > 1.
This would let the cache act as a "registry", so we could try using a preamble from a different TU as mentioned above.

(but this could also be done later, or could be done with a separate table of weak_ptrs. No change needed for this patch, just thinking out loud)
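
A hypothetical illustration of that suggestion, with Entry and Preamble as placeholders: keep active preambles registered, but skip them during eviction while some other owner still holds the shared_ptr.

  // Hypothetical eviction check; not clangd code.
  #include <memory>

  struct Preamble {};
  struct Entry {
    std::shared_ptr<Preamble> P;
  };

  bool isEvictable(const Entry &E) {
    // use_count() > 1 means an ASTWorker (or another consumer) still holds
    // this preamble, so treat it as pinned (cost 0) instead of evicting it.
    return E.P.use_count() <= 1;
  }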

Hey, don't worry about the delay, I won't have as much time on my hands anymore anyway.

  • we can definitely make this opt-in
  • MB instead of # is an idea; it's probably closer to what users want to configure, if they configure it at all, but it would also probably give a worse default value. Your call!
  • I mean, why not, but this diff was more about bringing the feature in than making it perfect. The heuristic for deciding which preambles to evict can be made arbitrarily complicated if we want to optimize it.
  • we can do as you suggest; it relates to the previous point
  • Ah! What would you recommend? No, forget it, I'll just make it opt-in for now.

I've also tried keeping the ASTWorkers alive in a different branch. The result is very different: ASTWorkers always use RAM so we can't keep as many, but keeping them alive makes it even better (15s with no cache, 2s with the preamble cache, virtually 0s with the ASTWorker cache). I'd say both features are nice, but they give different results for a different cost.
Also, it would be great to stop the ASTWorker threads while they are just sitting in the cache, but the class needs a rework to be able to stop/restart them.

  1. a *persistent* cache so closing+reopening clangd loses less state. (This is complicated because only the PCH is easily serializable, the rest of the PreambleData struct isn't)
  2. building caches of preambles while background-indexing (this would be great for modules but is probably way too big for whole preambles)
  3. reusing the "wrong" preamble initially when you open a new file, to give some basic functionality (using existing preamble patching logic, just in a more aggressive scenario)
  4. having the disk-based storage unlink the file preemptively, to eliminate any chance of leaking the *.pch

It feels like a mixture of 1 and 3 is going to provide the most value for decreasing time until semantic features (but I might be a little biased :D, also we might hit a nice sweet spot with pseudoparsing too).
I don't think having a cache of previously built preambles will ever be enough. As Sam pointed out, scaling is one of the biggest problems: I don't think it would be feasible to have tens of preambles lying around on disk, especially when they are costly to build (which usually implies increased size).
Surely it optimizes for users who work on a small set of files but frequently close and re-open them. But that's just one use case; it is also quite common to open tens of library headers while investigating an issue, or to trace through some code via chains of go-to-definitions.
Users won't have any preambles for those files for a while, and even after the preambles are built they'll just sit in the cache, probably only to be evicted.

So I think a better compromise is a cache of preambles that optimizes for reusability (keeping a small set of preambles that cover different sets of files, since we can't use a preamble for a source file it already covers) and then patching those preambles to be applicable to the file at hand. Surely it won't be as effective for the frequent close/re-open use case, but I don't think the costs of such a cache are justified if it only helps a single workflow.

As for mixing idea 1 into the equation: all of these still require clangd to do the work from scratch per instance. If we had some sort of persistent on-disk cache, we could both share the work (and the associated storage costs) across clangd instances and ensure clangd is responsive even at startup, without requiring the user to build a bunch of preambles with every new clangd instance first.

qchateau abandoned this revision. Nov 4 2021, 12:48 PM
Herald added a project: Restricted Project. Nov 4 2021, 12:48 PM